During a layover on a Christmas vacation to Alaska, I asked my friend, who majors in statistics, what an ROC curve is, a concept that had confused me for a long time. Although my friend gave me a detailed explanation, the only conclusion I remember is that the farther the curve lies above the y = x line, the better the ROC curve is. But why? Back home from vacation, I decided to figure it out.
The x and y axes of an ROC curve are the false positive rate (FPR) and the true positive rate (TPR), respectively. What are they? Let me use a simple example from Wikipedia to explain these concepts:
"Imagine a study evaluating a test that screens people for a disease. Each person taking the test either has or does not have the disease. The test outcome can be positive (classifying the person as having the disease) or negative (classifying the person as not having the disease). The test results for each subject may or may not match the subject's actual status. In that setting:
- True positive: Sick people correctly identified as sick
- False positive: Healthy people incorrectly identified as sick
- True negative: Healthy people correctly identified as healthy
- False negative: Sick people incorrectly identified as healthy
After getting the numbers of true positives, false positives, true negatives, and false negatives, the sensitivity and specificity for the test can be calculated. If it turns out the sensitivity is high then any person who has the disease is likely to be classified as positive by the test. On the other hand, if the specificity is high, any person who does not have the disease is likely to be classified as negative by the test."
The Wikipedia article also gives two straightforward pictures to illustrate the concepts of sensitivity and specificity.
Sensitivity, specificity, and FPR are calculated as follows:
sensitivity (TPR) = TP / (TP + FN)
specificity = TN / (TN + FP)
FPR = FP / (TN + FP) = 1 - specificity
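To make the three formulas concrete, here is a minimal Python sketch. The function names (`confusion_counts`, `rates`) and the ten made-up test results are my own illustration, not taken from Wikipedia; 1 stands for "sick" (or a positive test) and 0 for "healthy" (or a negative test).

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = sick, 0 = healthy)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # sick, identified as sick
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # healthy, identified as sick
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # healthy, identified as healthy
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # sick, identified as healthy
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Apply the formulas above to the four counts."""
    sensitivity = tp / (tp + fn)  # TPR
    specificity = tn / (tn + fp)
    fpr = fp / (tn + fp)          # equals 1 - specificity
    return sensitivity, specificity, fpr


# Hypothetical data: 4 sick people and 6 healthy people.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
sensitivity, specificity, fpr = rates(tp, fp, tn, fn)
print(tp, fp, tn, fn)
print(sensitivity, specificity, fpr)
```

In this toy data the test catches 3 of the 4 sick people and wrongly flags 1 of the 6 healthy ones, so the sensitivity is 3/4 and the FPR is 1/6.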
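Clearly, we want the TPR to be as high as possible at any given FPR, and this is why a better ROC curve lies farther above the y = x line. A test that guesses at random has TPR equal to FPR, which is exactly the y = x line.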
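The post does not say where the points on the curve come from, so here is a rough sketch under one common assumption: the test outputs a score for each person, and each possible cutoff on that score gives one (FPR, TPR) point. The names `roc_points` and `area_under`, and the six scores, are hypothetical and only for illustration.

```python
def roc_points(actual, scores):
    """Return (FPR, TPR) points, one per threshold, starting from (0, 0)."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nobody is called positive
    for threshold in sorted(set(scores), reverse=True):
        # Everyone scoring at or above the threshold is classified as positive.
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under(points):
    """Trapezoidal area under the curve; the y = x line would give 0.5."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: 1 = sick, 0 = healthy, with a made-up score per person.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(actual, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
print("area:", area_under(points))
```

Every point in this example sits on or above the y = x line, and the area under the curve comes out well above the 0.5 that random guessing would give.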

