
ROC AUC value is 0

I have trained a binary classifier, but I think that my ROC curve is incorrect.

This is the vector that contains labels:

y_true= [0, 1, 1, 1, 0, 1, 0, 1, 0]

and the second vector is the score vector

y_score= [
    0.43031937, 0.09115553, 0.00650781, 0.02242869, 0.38608587, 
    0.09407699, 0.40521139, 0.08062053, 0.37445426
]

When I plot my ROC curve, I get the following:

[ROC curve plot]

I think the code is correct, but I don't understand why I'm getting this curve, why the tpr, fpr, and threshold lists have length 4, and why my AUC is equal to zero.

fpr [0.   0.25 1.   1.  ]
tpr [0. 0. 0. 1.]
threshold [1.43031937 0.43031937 0.37445426 0.00650781]

My Code:

import sklearn.metrics as metrics

fpr, tpr, threshold = metrics.roc_curve(y_true, y_score)
roc_auc = metrics.auc(fpr, tpr)

# method I: plt
import matplotlib.pyplot as plt
plt.title('Receiver Operating Characteristic')
plt.plot(fpr, tpr, 'b', label='AUC = %0.2f' % roc_auc)
plt.legend(loc='lower right')
plt.plot([0, 1], [0, 1],'r--')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()
Asked Oct 20 '25 by Guizmo Charo


2 Answers

One thing to keep in mind about AUC is that what really matters is the distance from 0.5. An AUC near 0 just means that your "positive" and "negative" labels are effectively switched.

Looking at your scores, it's clear that a low score (anything below ~0.095) means a 1 and anything above that threshold means a 0. So you actually have a great binary classifier!

The problem is that, by default, higher scores are associated with the label 1, so your high-scoring points are being treated as 1's instead of 0's, and the ranking is wrong 100% of the time. Flip it and you're right 100% of the time.

The simple fix is to use the pos_label argument to sklearn.metrics.roc_curve. In this case you want your positive label to be 0.

fpr, tpr, threshold = metrics.roc_curve(y_true, y_score, pos_label=0)
roc_auc = metrics.auc(fpr, tpr)
print(roc_auc)
#1.0
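An alternative sketch of the same fix (assuming scikit-learn and NumPy are available): instead of changing pos_label, flip the scores themselves, e.g. via roc_auc_score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = [0, 1, 1, 1, 0, 1, 0, 1, 0]
y_score = np.array([0.43031937, 0.09115553, 0.00650781, 0.02242869, 0.38608587,
                    0.09407699, 0.40521139, 0.08062053, 0.37445426])

# With the scores as-is, every true 1 ranks below every true 0, so AUC is 0.
print(roc_auc_score(y_true, y_score))      # 0.0

# Flipping the scores reverses the ranking, which separates the classes perfectly.
print(roc_auc_score(y_true, 1 - y_score))  # 1.0
```

Either approach (pos_label=0 or flipped scores) expresses the same thing: low scores should count as evidence for the label 1.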
Answered Oct 22 '25 by pault


What @pault stated is misleading:

If you have a really low AUC, that just means that your "positive" and "negative" labels are switched.

AUC=0 implies that there is a threshold at which

  • all truly positive data points are classified as negative, and
  • all truly negative data points are classified as positive.

In other words, the scores separate the two classes perfectly, just in the reverse direction.

AUC=1 implies that there is a threshold that perfectly separates the data.
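To see this concretely with the question's data (a quick sketch using scikit-learn): AUC = 0 means the ranking is perfectly reversed, so recomputing the same curve with pos_label=0 yields AUC = 1:

```python
from sklearn import metrics

y_true = [0, 1, 1, 1, 0, 1, 0, 1, 0]
y_score = [0.43031937, 0.09115553, 0.00650781, 0.02242869, 0.38608587,
           0.09407699, 0.40521139, 0.08062053, 0.37445426]

# AUC with 1 as the positive class: every true 1 scores below every true 0.
fpr1, tpr1, _ = metrics.roc_curve(y_true, y_score, pos_label=1)
auc1 = metrics.auc(fpr1, tpr1)

# AUC with 0 as the positive class: the same scores now separate perfectly.
fpr0, tpr0, _ = metrics.roc_curve(y_true, y_score, pos_label=0)
auc0 = metrics.auc(fpr0, tpr0)

print(auc1, auc0)  # 0.0 1.0
```

With no tied scores, the two values always sum to 1, which is why AUC = 0 is just a perfect classifier with its threshold direction inverted.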

Answered Oct 22 '25 by Talos


