[Figure: Probability of Malignancy vs. Maximum Tumor Concavity]

Binary Prediction Metrics

This visual explanation introduces the metrics of model fit used when predicting binary outcomes. It uses the challenge of classifying tumors as benign or malignant to explore the importance of these metrics.

Cancer Data

A common analytical challenge is classifying an observation as a member of a group. As we walk through this explanation, we'll consider this example:

Can you use image data to classify tumors as benign or malignant?

We'll use the Wisconsin Breast Cancer dataset as a driving example.

Modeling the Data

The dataset provided has 29 different attributes of each tumor. For simplicity's sake, we'll only use the one that is most strongly correlated with the severity of the tumor.

The feature most related to the outcome of interest is the maximum concavity of each tumor. As you can see, there is a clear association between higher maximum concavity and malignancy.

Logistic Predictions

There are a variety of statistical and machine learning techniques one could use to predict a binary outcome, though a popular one is logistic regression (more on that another time).

Here, we can model the probability of malignancy using only the maximum concavity of the tumor.
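A minimal sketch of such a model in Python using scikit-learn might look like the following. The file and column names here are assumptions for illustration, not the article's actual data-prep script:

```python
# Sketch: fit a single-feature logistic regression for P(malignant).
# "wisconsin.csv", "max_concavity", and "diagnosis" are hypothetical names.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("wisconsin.csv")             # hypothetical file name
X = df[["max_concavity"]]                     # assumed column: maximum concavity
y = (df["diagnosis"] == "M").astype(int)      # 1 = malignant, 0 = benign

model = LogisticRegression().fit(X, y)
probs = model.predict_proba(X)[:, 1]          # predicted probability of malignancy
```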

Thresholds

Logistic regression allows us to model the probability of our outcome. If we want to convert these probabilities to binary outcomes (e.g., 0 or 1, benign or malignant), we'll need to pick a threshold.

Drag the center line to adjust the threshold:

Threshold: 0.50
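In code, applying a threshold is a single comparison. A toy sketch with made-up probabilities:

```python
import numpy as np

probs = np.array([0.05, 0.40, 0.62, 0.91])   # made-up predicted probabilities
threshold = 0.50
preds = (probs >= threshold).astype(int)     # array([0, 0, 1, 1]): 1 = malignant
```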

Accuracy

The most direct way to assess the fit of the model is its accuracy. To compute this, we check the percentage of cases in which the predicted class (benign or malignant) matches the observed data.

$$Accuracy = \frac{\text{Correct Predictions}}{\text{Total Cases}}$$

More formally, we could express this as:

$$Accuracy = \sum_{i = 1}^{N}{\frac{1 - |\hat{y}_i - y_i|}{N}}$$
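The formula translates directly into code. A sketch with toy labels (not the article's data):

```python
import numpy as np

y = np.array([0, 0, 1, 1])       # observed classes (toy data)
y_hat = np.array([0, 1, 1, 1])   # predicted classes at some threshold

# Each correct prediction contributes 1 - |y_hat - y| = 1; each error contributes 0.
accuracy = np.mean(1 - np.abs(y_hat - y))   # 0.75 here
```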

Drag the center line to see the accuracy of the model at different thresholds:

Vocabulary

To understand different metrics of success, it's important to know the conventional vocabulary. Hover over any description to identify relevant points on the plot.

True Positives (TP): The model predicts a case (and the case is true in the data)

False Positives (FP): The model predicts a case (and the case is not true in the data)

True Negatives (TN): The model does not predict a case (and the case is not true in the data)

False Negatives (FN): The model does not predict a case (and the case is true in the data)
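These four counts can be tallied directly from the predicted and observed classes. A small sketch with toy data, which the later metrics reuse:

```python
import numpy as np

y = np.array([0, 0, 1, 1])       # observed: 1 = malignant (toy data)
y_hat = np.array([0, 1, 1, 0])   # predicted classes

tp = np.sum((y_hat == 1) & (y == 1))   # true positives  -> 1
fp = np.sum((y_hat == 1) & (y == 0))   # false positives -> 1
tn = np.sum((y_hat == 0) & (y == 0))   # true negatives  -> 1
fn = np.sum((y_hat == 0) & (y == 1))   # false negatives -> 1
```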

Sensitivity

We could also consider how well our model identifies positive cases. This metric, called the sensitivity, is calculated as:

$$Sensitivity = \frac{\text{Correct Positive Predictions}}{\text{All Positive Cases}}$$

This is often expressed using the vocabulary introduced above:

$$Sensitivity = \frac{TP}{TP + FN}$$
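Using the counts from the vocabulary sketch above:

```python
# Fraction of truly malignant tumors the model catches (toy counts above)
sensitivity = tp / (tp + fn)   # 1 / (1 + 1) = 0.5
```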

Drag the center line to see the sensitivity of the model at different thresholds:

Specificity

Conversely, we could consider how well our model identifies negative cases. This metric, called the specificity, is calculated as:

$$Specificity = \frac{\text{Correct Negative Predictions}}{\text{All Negative Cases}}$$

This is often expressed using these conventional terms:

$$Specificity = \frac{TN}{TN + FP}$$
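And with the same toy counts:

```python
# Fraction of truly benign tumors the model correctly rules out
specificity = tn / (tn + fp)   # 1 / (1 + 1) = 0.5
```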

Drag the center line to see the sensitivity and specificity of the model at different thresholds:

ROC Curve

This trade-off between specificity and sensitivity is often visualized using a Receiver Operating Characteristic (ROC) curve, which plots the following:

False Positive Rate: along the x-axis, the false positive rate, which equals 1 - specificity.

True Positive Rate: along the y-axis, the true positive rate, which equals the sensitivity.
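One way to see where the curve's points come from is to sweep the threshold and record both rates at each step. A toy sketch (scikit-learn's roc_curve computes the same thing in one call):

```python
import numpy as np

y = np.array([0, 0, 1, 1])                   # observed classes (toy data)
probs = np.array([0.10, 0.60, 0.35, 0.80])   # predicted probabilities

for t in np.linspace(0, 1, 5):
    preds = (probs >= t).astype(int)
    tpr = np.sum((preds == 1) & (y == 1)) / np.sum(y == 1)   # sensitivity
    fpr = np.sum((preds == 1) & (y == 0)) / np.sum(y == 0)   # 1 - specificity
    print(f"threshold={t:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```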

About

This project was built by Michael Freeman, a faculty member at the University of Washington Information School.

If you like this explanation, check out explanations of writing code to work with data in my new book!

All code for this project is on GitHub, including the script to create the data and run regressions (done in Python). Feel free to issue a pull request for improvements, and if you like it, share it on Twitter. Layout inspired by Tony Chu.