AUC-ROC
Compute the Area Under the ROC Curve (AUC-ROC) for a binary classifier.
What is the ROC curve?
The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR / Recall) against the False Positive Rate (FPR) as the classification threshold sweeps from above the highest score down to below the lowest (from 1 down to 0 when the scores are probabilities):
TPR = TP / (TP + FN) (fraction of positives correctly retrieved)
FPR = FP / (FP + TN) (fraction of negatives incorrectly flagged)
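As a minimal sketch of the two definitions above, the following computes TPR and FPR at a single threshold. The function name `rates_at_threshold`, the `score >= threshold` convention for predicting positive, and the 0.0 fallback when a class is empty are assumptions made for illustration, not part of the problem statement.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when every score >= threshold is predicted positive."""
    tp = fp = fn = tn = 0
    for s, y in zip(scores, labels):
        predicted_pos = s >= threshold  # assumed convention: ">=" counts as positive
        if y == 1:
            if predicted_pos:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_pos:
                fp += 1
            else:
                tn += 1
    # Assumed fallback: report 0.0 when a class has no examples.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


scores = [0.9, 0.8, 0.4, 0.3]
labels = [1.0, 0.0, 1.0, 0.0]
print(rates_at_threshold(scores, labels, 0.5))  # (0.5, 0.5)
```

Each threshold yields one (FPR, TPR) point; collecting these points over all thresholds traces out the ROC curve.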
A random classifier follows the diagonal (AUC = 0.5). A perfect classifier hugs the top-left corner (AUC = 1.0). An inverted classifier falls below the diagonal (AUC < 0.5); negating its scores yields an AUC of 1 − AUC.
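The three reference cases can be checked with a small sketch that builds the ROC points and integrates them with the trapezoidal rule. The helper names `roc_points` and `trapezoid_area` are hypothetical, and the code assumes both classes are present and that tied scores are admitted together as one threshold step.

```python
def roc_points(scores, labels):
    """Return the ROC curve as (FPR, TPR) points from (0, 0) to (1, 1)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos  # assumes n_pos > 0 and n_neg > 0
    # Sort by descending score: lowering the threshold admits one score group at a time.
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = order[i][0]
        # Admit every example tied at this score before emitting a point.
        while i < len(order) and order[i][0] == s:
            if order[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Integrate a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.2, 0.1]
print(trapezoid_area(roc_points(scores, [1, 1, 0, 0])))  # perfect: 1.0
print(trapezoid_area(roc_points(scores, [0, 0, 1, 1])))  # inverted: 0.0
print(trapezoid_area(roc_points([0.5] * 4, [1, 0, 1, 0])))  # uninformative: 0.5
```

The last case gives every example the same score, so the curve is exactly the diagonal.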
AUC interpretation
AUC equals the probability that a randomly drawn positive example scores higher than a randomly drawn negative example, with ties counted as half:
AUC = P(score(pos) > score(neg)) + 0.5 × P(score(pos) = score(neg))
This makes it threshold-independent: you do not need to pick a decision boundary before computing it. It is also insensitive to the class ratio, because TPR and FPR are each normalized within their own class.
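To make the probabilistic reading concrete, the sketch below estimates AUC by literally drawing random positive/negative pairs. The function name `sampled_auc`, the number of draws, and the fixed seed are illustrative assumptions; ties are counted as half, matching the rank-based formula. The code assumes both classes are present.

```python
import random


def sampled_auc(scores, labels, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(score(pos) > score(neg)), ties counted as 0.5."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for _ in range(n_draws):
        p = rng.choice(pos)  # one random positive example
        n = rng.choice(neg)  # one random negative example
        total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / n_draws


scores = [0.9, 0.5, 0.4, 0.1]
labels = [1.0, 0.0, 1.0, 0.0]
print(sampled_auc(scores, labels))  # close to the exact value 0.75
```

Three of the four positive/negative pairs in this toy data are correctly ordered, so the estimate should settle near 3/4.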
Rank-based formula (equivalent to trapezoidal AUC)
Instead of integrating the curve, count concordant pairs directly:
AUC = [Σ_{i∈pos, j∈neg} (1 if score_i > score_j, else 0.5 if equal, else 0)] / (n_pos × n_neg)
This equals the area obtained by applying the trapezoidal rule to the ROC curve, and it is the same quantity that sklearn.metrics.roc_auc_score computes for binary labels.
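The formula above translates almost line for line into a double loop. This is a minimal sketch with a hypothetical function name, `pairwise_auc`; it assumes both classes are present and costs O(n_pos × n_neg) time, so it suits small inputs and sanity checks more than large datasets.

```python
def pairwise_auc(scores, labels):
    """Average the concordance score over all (positive, negative) pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0  # concordant pair
            elif p == n:
                total += 0.5  # tie counts as half
            # discordant pairs contribute 0
    return total / (len(pos) * len(neg))


scores = [0.9, 0.5, 0.4, 0.1]
labels = [1.0, 0.0, 1.0, 0.0]
print(pairwise_auc(scores, labels))  # 0.75
```

In this example the pairs (0.9, 0.5), (0.9, 0.1), and (0.4, 0.1) are concordant and (0.4, 0.5) is discordant, giving 3 / 4 = 0.75.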
Edge case
If there are no positive examples (n_pos == 0) or no negative examples (n_neg == 0), the ROC curve is undefined; in that case, return 0.0.
When to use AUC-ROC
- Imbalanced binary classification (spam, fraud, medical diagnosis).
- Comparing models without committing to a threshold.
- When both TPR and FPR matter equally across all operating points.
Inputs
- scores: shape (N,), binary classifier scores (higher = more positive).
- labels: shape (N,), true labels in {0, 1} (delivered as floats).
Output
Scalar AUC in [0, 1].
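One possible way to meet this input/output contract in O(N log N) time is to use average ranks, a standard reformulation of the pair count (the Mann-Whitney U statistic). This is a sketch and not the reference solution: the function name `auc_roc` is hypothetical, and plain Python sequences stand in for the shape (N,) arrays.

```python
def auc_roc(scores, labels):
    """Scalar AUC in [0, 1]; returns 0.0 when either class is empty."""
    n = len(scores)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0  # edge case: ROC curve undefined

    # Assign 1-based ascending ranks; tied scores share their average rank.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts concordant pairs (ties as 0.5); dividing by all pairs gives AUC.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


print(auc_roc([0.9, 0.5, 0.4, 0.1], [1.0, 0.0, 1.0, 0.0]))  # 0.75
print(auc_roc([0.7, 0.2], [1.0, 1.0]))  # 0.0 (no negatives)
```

Sorting once replaces the nested loop over pairs, and averaging ranks within tied groups reproduces the 0.5 credit for ties.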