A confusion matrix is an important tool for evaluating a machine learning classification model trained on a set of data. It shows how accurately the model categorized each record and where errors may be occurring. This makes it a way to measure performance for any model whose goal is to classify records into two or more classes.
For a two-class (binary) problem, the matrix is a table with four combinations of actual and predicted values. The rows in the matrix show the actual labels from the dataset, while the columns show the labels predicted by the model. Each cell is named with two words: positive or negative states which class the model predicted, and true or false states whether that prediction matched the actual label.
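The layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `confusion_matrix`, the labels "pos" and "neg", and the sample data are all assumptions made for the example.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Return a table whose rows are actual labels and whose columns are predicted labels."""
    # Count how often each (actual, predicted) pair occurs.
    counts = Counter(zip(actual, predicted))
    # Missing pairs count as zero because Counter returns 0 for unseen keys.
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical data: six records with their true and predicted labels.
labels = ["pos", "neg"]
actual = ["pos", "pos", "neg", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"actual {label}: {row}")
```

Because the function only relies on the list of labels, the same sketch works for more than two classes; the table simply grows to one row and one column per class.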
A true positive occurs when the model predicts the positive class and the prediction is correct, while a true negative occurs when the model predicts the negative class and the prediction is correct. For example, if the positive class is "female", it is a true positive when the model predicts that a person is female and she is, and a true negative when the model predicts that a person is not female and he is not.
On the other hand, a false positive occurs when the model predicts the positive class but the actual label is negative; this is a Type I error. Similarly, a false negative occurs when the model predicts the negative class but the actual label is positive; this is a Type II error.
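The four outcomes can be counted directly from paired lists of actual and predicted labels. The sketch below assumes a binary problem in which the value 1 marks the positive class; the function name `count_outcomes` and the sample data are hypothetical.

```python
def count_outcomes(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative (Type I error)
        elif a == positive:
            fn += 1  # predicted negative, actually positive (Type II error)
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Hypothetical data: eight records, 1 = positive class, 0 = negative class.
actual = [1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1, 0, 0, 0]

outcomes = count_outcomes(actual, predicted)
print(outcomes)
```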
Confusion matrices are essential for measuring how accurately a model predicts outcomes and for spotting classes that it repeatedly confuses with one another. This analysis lets users determine what is working correctly and where additional training may be required.
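One way to see which classes a model repeatedly confuses is to count only the off-diagonal entries, that is, the records where the predicted label differs from the actual one. The following is a rough sketch under that assumption; the helper name `most_confused_pairs` and the animal labels are made up for illustration.

```python
from collections import Counter

def most_confused_pairs(actual, predicted, top=3):
    """Return the most frequent (actual, predicted) pairs among misclassified records."""
    errors = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return errors.most_common(top)

# Hypothetical three-class data.
actual = ["cat", "cat", "dog", "dog", "bird", "cat", "dog", "bird"]
predicted = ["cat", "dog", "dog", "cat", "bird", "dog", "dog", "cat"]

for (a, p), n in most_confused_pairs(actual, predicted):
    print(f"actual {a} predicted as {p}: {n} time(s)")
```

In this sample, cats labeled as dogs is the most common mistake, which would suggest looking at more training examples or better features for telling those two classes apart.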
The confusion matrix can help a company decide when a machine learning model is ready to be used to make business decisions. During testing, a false positive or a false negative has minimal consequences, but once the model is in use, an incorrect decision can cause serious issues for a firm. If there are too many false positives or false negatives, the company may need to consider a different model, additional data, or feature engineering to improve results.
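What counts as "too many" depends on the business, but the check itself is simple to express. The sketch below turns the four counts into a false positive rate and a false negative rate and compares them with limits chosen by the user. The function names, the counts, and the limits are all assumptions for illustration, not recommended values.

```python
def error_rates(tp, tn, fp, fn):
    """Return (false positive rate, false negative rate) from the four counts."""
    # Share of actual negatives that were wrongly predicted positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Share of actual positives that were wrongly predicted negative.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr

def within_limits(tp, tn, fp, fn, max_fpr, max_fnr):
    """Check both error rates against limits set by the business (hypothetical rule)."""
    fpr, fnr = error_rates(tp, tn, fp, fn)
    return fpr <= max_fpr and fnr <= max_fnr

# Hypothetical test results for 1,000 records.
fpr, fnr = error_rates(tp=90, tn=880, fp=20, fn=10)
print(f"false positive rate: {fpr:.3f}, false negative rate: {fnr:.3f}")
print(within_limits(tp=90, tn=880, fp=20, fn=10, max_fpr=0.05, max_fnr=0.05))
```

Keeping the two limits separate reflects the point above: a firm may tolerate one kind of error far more readily than the other.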
At LogicPlum, our automated data science workflow provides your business with everything that you need to build, deploy, and maintain machine learning models within your organization. Our team of experts will help you analyze the automatically generated confusion matrices and find the right level of accuracy for your unique business needs.
© 2020 LogicPlum, Inc. All rights reserved.