Concept:
A Confusion Matrix is a performance evaluation tool for classification models. It summarizes prediction results by comparing actual labels with predicted labels, helping analyze model strengths and weaknesses.
Step 1: {\color{red}Structure of a Confusion Matrix}
A binary classification confusion matrix contains four components:
- True Positive (TP): Correctly predicted positive cases
- True Negative (TN): Correctly predicted negative cases
- False Positive (FP): Negative cases incorrectly predicted as positive (Type I error)
- False Negative (FN): Positive cases incorrectly predicted as negative (Type II error)
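The four counts above can be tallied directly from label lists. The following is a minimal sketch in plain Python; the example labels are made up for illustration:

```python
# Illustrative binary labels (1 = positive, 0 = negative); not from the text.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Tally each confusion-matrix cell by comparing actual vs. predicted labels.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error

print(tp, tn, fp, fn)  # 3 3 1 1
```

Note that the four counts always sum to the total number of samples, which is a quick sanity check when building a confusion matrix by hand.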
Step 2: {\color{red}Accuracy}
Accuracy measures the overall correctness of the model:
\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
\]
It shows the proportion of total correct predictions.
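As a quick numeric check of the formula, here is a sketch with assumed cell counts (the numbers are illustrative, not from the text):

```python
# Assumed confusion-matrix counts for illustration.
tp, tn, fp, fn = 50, 40, 5, 5

# Accuracy = (TP + TN) / (TP + TN + FP + FN)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.9 — 90 correct predictions out of 100 samples
```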
Step 3: {\color{red}Precision}
Precision measures prediction reliability for the positive class:
\[
\text{Precision} = \frac{TP}{TP + FP}
\]
It answers: \emph{Out of predicted positives, how many were correct?}
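The same kind of check works for precision; note that only TP and FP appear in the denominator, since precision looks only at what the model *predicted* as positive (counts are again assumed for illustration):

```python
# Assumed counts: 45 correct positive predictions, 5 false alarms.
tp, fp = 45, 5

# Precision = TP / (TP + FP)
precision = tp / (tp + fp)
print(precision)  # 0.9 — 90% of predicted positives were actually positive
```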
Step 4: {\color{red}Recall (Sensitivity)}
Recall measures the model’s ability to detect actual positives:
\[
\text{Recall} = \frac{TP}{TP + FN}
\]
It answers: \emph{Out of actual positives, how many were correctly identified?}
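Recall swaps FP for FN in the denominator, because it is computed over the *actual* positives rather than the predicted ones. A minimal sketch with assumed counts:

```python
# Assumed counts: 45 positives caught, 15 positives missed.
tp, fn = 45, 15

# Recall = TP / (TP + FN)
recall = tp / (tp + fn)
print(recall)  # 0.75 — 75% of actual positives were identified
```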
Step 5: {\color{red}Why These Metrics Matter}
- Accuracy is informative mainly when classes are balanced; on imbalanced data it can be misleading
- Precision is important when false positives are costly
- Recall is critical when missing positives is dangerous (e.g., medical diagnosis)
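The trade-offs above can be seen in one small example. Consider an assumed imbalanced dataset (95 negatives, 5 positives) and a degenerate classifier that predicts negative for everything:

```python
# Assumed scenario: 95 negatives, 5 positives; the model predicts
# "negative" for every sample, so it never produces a TP or FP.
tp, tn, fp, fn = 0, 95, 0, 5

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)

print(accuracy)  # 0.95 — looks excellent
print(recall)    # 0.0  — yet every actual positive is missed
```

In a medical-diagnosis setting this model would appear 95% accurate while detecting zero patients, which is exactly why recall must be examined alongside accuracy on imbalanced data.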