Let's look at models that attempt to predict classifications. As an example, we'll consider the world of medicine, and in particular, the goal of determining whether or not a person has a particular disease. The two possible class values are yes and no. In other words, the model attempts to predict whether a person has the disease, and the outcome is a yes, or doesn't have it, and the outcome is a no. The test itself produces numerical data that is input into our model, which then generates the yes or no prediction. Now, real-world models are not perfect. Sometimes the model predicts a yes when the person doesn't really have the disease, and a no when he or she in fact does have the disease. These are called false positives and false negatives respectively. Obviously, we want the false positive rate and the false negative rate to be as low as possible. Conversely, we have true positives and true negatives, which reflect cases where the model made correct predictions. If you add up all the false positives, false negatives, true positives, and true negatives, you'll end up with the total number of tests. Statisticians sometimes use what's called a confusion matrix to keep track of these values.

Let's put some numbers into our example. Suppose we run 200 tests and find that the model predicts 135 positive cases, with 125 of these really being positive and 10 really being negative. The model also predicts that 65 of the cases are negative, with 50 of them truly being negative and 15 truly being positive. Keep in mind that for our example, we're defining a yes outcome as being positive and a no outcome as being negative. The number of true positives is thus 125, and the number of true negatives is 50. We can calculate rates for these. The true positive rate, which is called the sensitivity, is given by the number of true positives divided by the total number of actual positives, or 125 divided by 140, which gives us 0.893. In other words, the true positive rate or sensitivity is 89.3 percent. The true negative rate, also called the specificity, is given by the number of true negatives divided by the number of actual negatives, or 50 divided by 60, which gives us 0.833. In other words, the true negative rate or specificity is 83.3 percent.

If you want to compare several models to determine which one is better for your particular situation, two useful rates are the accuracy, which tells you how often the model is correct overall, and the precision, which tells you how often the model is correct when the prediction is positive. The accuracy is calculated by taking the sum of the true positives and true negatives and dividing it by the total number of tests run. In our example, this is 125 plus 50, which gives 175, divided by 200. The result is 0.875, or 87.5 percent. The precision is calculated by taking the number of true positives and dividing it by the total number of predicted positives. For our example, we have 125 divided by 135, which gives us 0.926, or 92.6 percent. When picking the most appropriate model, you'll want to compare all these rates: sensitivity, specificity, accuracy, and precision.
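If it helps to see all of these calculations in one place, here is a minimal Python sketch that plugs in the counts from our example; the variable names are just for illustration, not part of any particular library.

```python
# Confusion-matrix counts from the example (200 tests total)
true_positives = 125   # model predicted yes, person actually has the disease
false_positives = 10   # model predicted yes, person does not have the disease
true_negatives = 50    # model predicted no, person does not have the disease
false_negatives = 15   # model predicted no, person actually has the disease

total = true_positives + false_positives + true_negatives + false_negatives  # 200

# Sensitivity (true positive rate): correct positives out of all actual positives
sensitivity = true_positives / (true_positives + false_negatives)   # 125 / 140 = 0.893

# Specificity (true negative rate): correct negatives out of all actual negatives
specificity = true_negatives / (true_negatives + false_positives)   # 50 / 60 = 0.833

# Accuracy: how often the model is correct overall
accuracy = (true_positives + true_negatives) / total                # 175 / 200 = 0.875

# Precision: how often a positive prediction is correct
precision = true_positives / (true_positives + false_positives)     # 125 / 135 = 0.926

print(f"Sensitivity: {sensitivity:.3f}")
print(f"Specificity: {specificity:.3f}")
print(f"Accuracy:    {accuracy:.3f}")
print(f"Precision:   {precision:.3f}")
```

Running this prints the same four rates we worked out by hand, which makes it easy to swap in the counts from your own confusion matrix.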
One final comment: if your classes are imbalanced, meaning that the number of positive outcomes is much smaller than the number of negative outcomes or vice versa, you might calculate a very high value for accuracy but still miss many of those infrequent positive outcomes. For instance, if you're looking at a rare disease that has a natural base rate of, say, two positive cases for every 10,000 people in your population, you'll need to take a different approach to evaluating your model. You may want to oversample your positive cases or undersample your negative cases, so you have a better balance between positives and negatives. You can then calculate the accuracy and other ratios for your model.
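To make the oversampling idea concrete, here is a minimal sketch in plain Python. The data is hypothetical, built to mimic a base rate of roughly two positives per 10,000 cases, and it simply duplicates the rare positive cases (sampling with replacement) until the two classes are balanced.

```python
import random

random.seed(42)

# Hypothetical labeled examples: (features, label), where label 1 is the rare positive class.
positives = [((0.10, 0.20), 1), ((0.30, 0.10), 1)]
negatives = [((random.random(), random.random()), 0) for _ in range(9_998)]

# Random oversampling: draw positive cases with replacement
# until there are as many positives as negatives.
oversampled_positives = [random.choice(positives) for _ in range(len(negatives))]
balanced = negatives + oversampled_positives
random.shuffle(balanced)

print(f"Before: {len(positives)} positive / {len(negatives)} negative")
print(f"After:  {sum(1 for _, y in balanced if y == 1)} positive / "
      f"{sum(1 for _, y in balanced if y == 0)} negative")
```

Undersampling works the other way around: you would keep all the positive cases and randomly discard negative cases until the counts match. Either way, the goal is the same, a training or evaluation set where the accuracy figure isn't dominated by the majority class.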