AUROC clearly explained
The worst thing you can do is start explaining ML metrics to stakeholders. I did it a lot. Don’t ever do it. Seriously. But here, in our cozy space with physics, mathematics and ML, we can afford the luxury of discussing how to measure the quality of a model, as promised some time ago.
Our setup:
👉 target — the thing we are predicting: survival — binary — 0/1, false/true
👉 model — any function you can imagine; takes all features, returns the Score
👉 Score — a number; small ~ false (0) target, big ~ true (1) target
In our informal language, we calculate the score for each passenger using the chain: passenger → features → model → Score. The question is: does this Score actually work? How do we tell?
Let’s cheat. For a moment, let’s use the target itself (survived) as the Score. Then ask the passengers to line up in order of increasing score. Obviously, all survivors end up at the right end of the line and the deceased at the left. There is a position in this line where we can split it into two crowds: in the left crowd every passenger has target = 0, and in the right crowd, target = 1.
This gives us an idea. We ask our passengers to line up in order of increasing score, whatever the model is. Then we walk from right to left, that is, from the highest score to the lowest, and count the survived and the deceased. Just counting is not enough: at the end of the walk we would only have the total number of passengers in each class. We need to record how these two counts grow along the way. But how?
A totally genius idea is to use these two counters as coordinates on the plane: deceased on the horizontal axis, survived on the vertical. Each survived passenger makes us take a step up, and each deceased passenger a step to the right.
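The walk described above can be sketched in a few lines of Python. This is only an illustration: the function name `roc_walk` and the toy scores are made up for the example, and tied scores are not treated specially.

```python
def roc_walk(scores, targets):
    """Walk from the highest score to the lowest and record the two counters.

    Returns a list of (deceased_count, survived_count) points, starting at (0, 0).
    Hypothetical helper for illustration; ties in score are not handled specially.
    """
    # Passenger indices ordered from the biggest score to the smallest
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    deceased, survived = 0, 0
    path = [(deceased, survived)]
    for i in order:
        if targets[i] == 1:
            survived += 1   # survived passenger: step up
        else:
            deceased += 1   # deceased passenger: step to the right
        path.append((deceased, survived))
    return path


# Toy data (made up): four passengers, two survived
print(roc_walk([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
# → [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]
```

The path always ends at (number of deceased, number of survived), whatever the model is; only the route it takes depends on the score.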
If we cheated, we move all the way up, then all the way to the right. If the passengers were randomly shuffled, we stay in a narrow band near the diagonal of the rectangle spanned by the total numbers of deceased and survived.
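Last step: scale the axes to turn our bounding rectangle into a unit square, dividing the vertical count by the total number of survived and the horizontal count by the total number of deceased. Here we are. The coordinates are TPR (vertical) and FPR (horizontal), our trajectory is the ROC curve, and the area under it is AUROC, the Area Under the Receiver Operating Characteristic Curve.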
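As a rough sketch, the area can be computed during the same walk: every step to the right adds a column whose height is the current survived count, and dividing by the area of the bounding rectangle does the scaling to a square. The function name `auroc` and the synthetic data are assumptions made for this example, and ties between passengers of different classes are not handled specially.

```python
import random


def auroc(scores, targets):
    """Area under the ROC curve via the step walk (illustrative sketch)."""
    n_pos = sum(targets)              # total survived
    n_neg = len(targets) - n_pos      # total deceased
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one passenger of each class")

    # Walk from the highest score to the lowest
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    survived = 0   # current height of the path
    cells = 0      # unit cells under the path, in raw counts
    for i in order:
        if targets[i] == 1:
            survived += 1        # step up
        else:
            cells += survived    # step right: add a column of the current height

    # Scale the bounding rectangle (n_neg by n_pos) to the unit square
    return cells / (n_pos * n_neg)


# Cheating: the target itself is used as the Score
targets = [0, 1, 1, 0, 1]
print(auroc([float(t) for t in targets], targets))   # → 1.0

# Randomly shuffled passengers: scores unrelated to the target (synthetic data)
random.seed(0)
rand_targets = [random.randint(0, 1) for _ in range(2000)]
rand_scores = [random.random() for _ in range(2000)]
print(auroc(rand_scores, rand_targets))              # close to 0.5
```

Counting cells in integers and dividing once at the end keeps the result exact for small examples, which makes the two extreme cases easy to check by hand.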
I’m sure you have heard all of this before. But I need a solid base for the madness that follows.
