More than 5 years have passed since last update.

Plot ROC Curve for Binary Classification with Matplotlib

Last updated at 2017-08-09Posted at 2017-07-22

ROC Curve and AUC

For evaluating a binary classification model, Area under the Curve is often used. AUC (In most cases, C represents ROC curve) is the size of area under the plotted curve. In ROC (Receiver operating characteristic) curve, true positive rates are plotted against false positive rates. The closer AUC of a model is getting to 1, the better the model is. It means, a model with higher AUC is preferred over those with lower AUC.

Calculating AUC is not so difficult as you can find scikit-learn module for AUC and all you need to do is passing your prediction vector and target score vector to AUC module. Then, the module calculates true positive rates and false positive rates automatically and returns AUC value.

Plot ROC Curve

The purpose of using AUC is to evaluate your model's prediction. As AUC returns results as numeric values. So just by comparing those numbers, you can pick which model has the best prediction performance. However, in my opinion, showing a plot instead of numbers are instinctively much convincible and easy to grab the concept. Therefore, not just calculating AUC, but also I tried to plot ROC Curve.

Following the instruction here http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html, I implemented the code like below. Note that, in the example below, "predictions_test" contains prediction results by my model and "outcome_test" is target score for comparison.

from sklearn.metrics import roc_curve, auc
fpr, tpr, thresholds = roc_curve(predictions_test, outcome_test)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=1, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic')
plt.legend(loc="lower right")
plt.show()

Then, the following figure returned.

ROC Curve for binary classification

Successfully I was able to get ROC Curve polt, however, it is actually a little bit different from what I expected like below.

It seems like there are only 3 points (including [0,0] and [1,1]) in my ROC curve. It is not a curve at all. I wondered and googled it and I found out this is how ROC curve works.

According to the post: https://stackoverflow.com/questions/30051284/plotting-a-roc-curve-in-scikit-yields-only-3-points, the number of points in ROC curve depends on the number of unique value in input data. In this case, this is binary classification problem so input has only binary value 0 and 1. Therefore, ROC curve for my prediction model is nothing wrong.

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up