Evaluation Metrics in Machine Learning - ROC / AUC

The Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) are important metrics for evaluating the performance of binary classification models.

The ROC curve describes how the True Positive Rate (TPR) changes as the False Positive Rate (FPR) changes.

The True Positive Rate is the proportion of actual positive samples that are classified as positive. In other words, TPR is identical to the recall score we studied in the previous post: TPR = TP / (TP + FN).

The counterpart of TPR is the True Negative Rate (TNR), also known as Specificity, which equals TN / (TN + FP). The FPR plotted on the x-axis of the ROC curve is its complement: FPR = FP / (FP + TN) = 1 - Specificity.
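To make these rates concrete, here is a minimal sketch using hypothetical y_true / y_pred arrays (not from the Titanic dataset) that derives TPR, TNR, and FPR from a confusion matrix.

import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions, for illustration only
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0, 1, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)   # recall / sensitivity
tnr = tn / (tn + fp)   # specificity
fpr = fp / (fp + tn)   # 1 - specificity (x-axis of the ROC curve)

print('TPR : {0:.2f}, TNR : {1:.2f}, FPR : {2:.2f}'.format(tpr, tnr, fpr))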

Above is an example of an ROC curve. The straight dotted line connecting the two endpoints is the baseline of a random classifier; the ROC curve is better the farther it bows away from that diagonal.

The ROC curve is traced by sweeping the decision threshold, which moves the FPR across the range 0 to 1 and records the corresponding TPR. Using the threshold adjustment we discussed previously, we can vary the FPR and observe how the TPR changes, as sketched below.
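The following rough sketch assumes the lr_clf model, X_test, and y_test from the earlier Titanic example. It thresholds the predicted probabilities by hand and shows that lowering the threshold raises TPR and FPR together, which is exactly the movement along the ROC curve.

import numpy as np
from sklearn.metrics import confusion_matrix

# Probability of the positive class (label 1); lr_clf is assumed
# to be the logistic regression model trained in the previous post.
proba_class1 = lr_clf.predict_proba(X_test)[:, 1]

for threshold in [0.3, 0.5, 0.7]:
    # Classify as positive when the probability meets the threshold
    custom_pred = (proba_class1 >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, custom_pred).ravel()
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    print('Threshold {0:.1f} -> FPR {1:.3f}, TPR {2:.3f}'.format(threshold, fpr, tpr))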

Using scikit-learn's roc_curve() API, we will visualize the ROC curve for the Titanic dataset.

Input

# Use the roc_curve() API; lr_clf is the logistic regression model
# trained on the Titanic dataset in the previous post.
import numpy as np
from sklearn.metrics import roc_curve

# Probability of the positive class (label 1)
pred_proba_class1 = lr_clf.predict_proba(X_test)[:, 1]

fprs, tprs, thresholds = roc_curve(y_test, pred_proba_class1)

# Sample every 5th threshold, starting from index 1
thr_index = np.arange(1, thresholds.shape[0], 5)

print('Threshold Value Index : ', thr_index)
print('Values of Thresholds : ', np.round(thresholds[thr_index], 2))

# FPR & TPR values at the sampled thresholds
print('FPRS : ', np.round(fprs[thr_index], 2))
print('TPRS : ', np.round(tprs[thr_index], 2))

Output

Threshold Value Index :  [ 1  6 11 16 21 26 31 36 41 46]
Values of Thresholds :  [0.94 0.73 0.62 0.52 0.44 0.28 0.15 0.14 0.13 0.12]
FPRS :  [0.   0.01 0.03 0.08 0.13 0.25 0.58 0.61 0.75 0.85]
TPRS :  [0.02 0.49 0.7  0.74 0.8  0.89 0.9  0.95 0.97 1.  ]

Input

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

def roc_curve_plot(y_test, pred_proba_c1):
    # Compute FPR/TPR pairs across all thresholds
    fprs, tprs, thresholds = roc_curve(y_test, pred_proba_c1)

    # ROC curve and the diagonal baseline of a random classifier
    plt.plot(fprs, tprs, label='ROC')
    plt.plot([0, 1], [0, 1], 'k--', label='Random')

    # Tick the x-axis in steps of 0.1
    start, end = plt.xlim()
    plt.xticks(np.round(np.arange(start, end, 0.1), 2))
    plt.xlim(0, 1); plt.ylim(0, 1)
    plt.xlabel('FPR (1 - Specificity)'); plt.ylabel('TPR (Recall)')
    plt.legend(); plt.grid()
    plt.show()

roc_curve_plot(y_test, pred_proba_class1)

Output

The plotted ROC curve bows well away from the diagonal baseline. Now we will estimate the AUC, the area under this curve: 1.0 for a perfect classifier and 0.5 for a random one.

Input

from sklearn.metrics import roc_auc_score

# roc_auc_score() takes the probability scores of the positive class
pred_proba = lr_clf.predict_proba(X_test)[:, 1]
roc_score = roc_auc_score(y_test, pred_proba)
print('ROC AUC Value : {0:.4f}'.format(roc_score))

Output

ROC AUC Value : 0.8987
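As a cross-check, the same area can be obtained by integrating the FPR/TPR points returned by roc_curve() with scikit-learn's auc() function. This sketch assumes the fprs and tprs arrays computed above; the result should agree with roc_auc_score().

from sklearn.metrics import auc

# Trapezoidal integration of the ROC curve points computed earlier
roc_auc = auc(fprs, tprs)
print('AUC from roc_curve points : {0:.4f}'.format(roc_auc))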
