In Machine Learning, Precision, Recall, and F1-score are evaluation metrics mainly used for classification problems, especially when the dataset is imbalanced.
They come up frequently in interviews for AI/ML roles.
1. Precision
Precision answers:
“Out of all the items the model predicted as positive, how many were actually positive?”
Formula
Precision = \frac{TP}{TP + FP}
Where:
- TP = True Positives
- FP = False Positives
Simple Example
Suppose a model predicts whether an email is spam.
- The model predicted 20 emails as spam
- Out of those, 15 were actually spam
So:
- TP = 15
- FP = 5
Precision:
Precision = \frac{15}{15 + 5} = 0.75
Precision = 75%
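The spam example above can be checked with a few lines of Python. This is a minimal sketch: the function name `precision` and the choice to return 0.0 when the model predicts no positives are assumptions made for illustration, not a standard from any particular library.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    predicted_positive = tp + fp
    # Assumption: return 0.0 when nothing was predicted positive,
    # because TP / (TP + FP) is undefined in that case.
    if predicted_positive == 0:
        return 0.0
    return tp / predicted_positive


# Spam example: 20 emails flagged as spam, 15 of them really spam.
spam_precision = precision(tp=15, fp=5)
print(f"Precision = {spam_precision:.0%}")  # Precision = 75%
```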
Interview-Friendly Meaning
High precision means:
- When the model says “Yes”, it is usually correct.
- Low false positives.
Real-World Use Cases
Precision is important when false positives are costly.
Examples:
- Spam detection
- Fraud detection
- Important email filtering
Example:
If a legitimate banking transaction is incorrectly flagged as fraud, the customer's payment may be blocked for no reason.
2. Recall
Recall answers:
“Out of all actual positive cases, how many did the model correctly identify?”
Formula
Recall = \frac{TP}{TP + FN}
Where:
- FN = False Negatives
Example
Suppose:
- There are actually 30 spam emails
- The model correctly identified 15
So:
- TP = 15
- FN = 15
Recall:
Recall = \frac{15}{15 + 15} = 0.50
Recall = 50%
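The same check works for recall. Again a sketch under assumptions: the name `recall` and the 0.0 fallback for a dataset with no actual positives are illustrative choices.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model found."""
    actual_positive = tp + fn
    # Assumption: return 0.0 when there are no actual positives,
    # because TP / (TP + FN) is undefined in that case.
    if actual_positive == 0:
        return 0.0
    return tp / actual_positive


# Spam example: 30 real spam emails, 15 of them caught.
spam_recall = recall(tp=15, fn=15)
print(f"Recall = {spam_recall:.0%}")  # Recall = 50%
```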
Interview-Friendly Meaning
High recall means:
- The model captures most positive cases.
- Low false negatives.
Real-World Use Cases
Recall is important when missing a positive case is dangerous.
Examples:
- Cancer detection
- Disease diagnosis
- Intrusion detection
Example:
Missing a patient who has cancer is far worse than raising a false alarm.
3. F1-Score
F1-score combines Precision and Recall into a single metric.
It is the harmonic mean of precision and recall.
Formula
F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}
Example
If:
- Precision = 0.75
- Recall = 0.50
Then:
F1 = 2 \times \frac{0.75 \times 0.50}{0.75 + 0.50} = 2 \times \frac{0.375}{1.25} = 0.60
F1-score = 60%
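To see why the harmonic mean is used rather than a plain average, the sketch below computes both for the example values. The function name `f1_score` and the 0.0 fallback when both inputs are zero are assumptions for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    # Assumption: define F1 as 0.0 when both inputs are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


p, r = 0.75, 0.50
harmonic = f1_score(p, r)
arithmetic = (p + r) / 2
print(f"F1 (harmonic mean) = {harmonic:.3f}")    # 0.600
print(f"Arithmetic mean    = {arithmetic:.3f}")  # 0.625

# A lopsided model: perfect precision but almost no recall.
lopsided = f1_score(1.0, 0.01)
print(f"Lopsided F1        = {lopsided:.3f}")    # 0.020
```

The harmonic mean is pulled toward the smaller of the two values, so a model cannot reach a high F1-score by doing well on only one of them.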
Why Use F1-Score?
Accuracy can be misleading for imbalanced datasets.
Example:
- 99 normal transactions
- 1 fraud transaction
If the model predicts everything as normal:
- Accuracy = 99%
- But fraud detection failed completely: the one fraud case is missed, so recall is 0%.
Because the model has no true positives, its F1-score is also 0, which exposes the failure that accuracy hides.
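The imbalanced example can be reproduced directly. In this sketch the label encoding (1 = fraud, 0 = normal) and the variable names are assumptions, and F1 is written in terms of counts, 2TP / (2TP + FP + FN), which is equivalent to the harmonic-mean formula and stays defined even when precision is not.

```python
# 1 = fraud, 0 = normal: 99 normal transactions and 1 fraud.
y_true = [0] * 99 + [1]
# A model that labels every transaction as normal.
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
accuracy = sum(t == p for t, p in pairs) / len(pairs)

tp = sum(t == 1 and p == 1 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)

# Guard against division by zero (assumed fallback: 0.0).
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

print(accuracy, recall, f1)  # 0.99 0.0 0.0
```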
Quick Comparison Table
| Metric | Focus | Important When |
|---|---|---|
| Precision | Correct positive predictions | False positives are costly |
| Recall | Finding all positives | False negatives are costly |
| F1-score | Balance of precision & recall | Need balanced performance |
Easy Interview Memory Trick
- Precision → “How precise are positive predictions?”
- Recall → “How many actual positives did we recall/find?”
- F1-score → “Balanced score between both”
Common Interview Question
Q: Which is more important — Precision or Recall?
Answer:
It depends on the business problem.
- Use Precision when false positives are expensive.
  - Example: Spam filtering
- Use Recall when false negatives are dangerous.
  - Example: Disease detection
- Use F1-score when both matter.
Confusion Matrix Connection
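These metrics come from the confusion matrix, where rows are the actual classes, columns are the predicted classes, and TN = True Negatives:
| Actual / Predicted | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | TP | FN |
| Actual Negative | FP | TN |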
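As a rough illustration of how the four cells are counted from raw labels, the sketch below rebuilds the spam example. The names `Confusion` and `confusion_counts` are hypothetical, and the 65 true negatives are an assumed filler value, since the example above does not state how many non-spam emails were correctly passed through.

```python
from typing import NamedTuple, Sequence


class Confusion(NamedTuple):
    tp: int
    fp: int
    fn: int
    tn: int


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Count the four confusion-matrix cells for binary labels (1 = positive)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


# Spam example: 15 TP, 5 FP, 15 FN. The 65 TN are an assumption.
y_true = [1] * 15 + [0] * 5 + [1] * 15 + [0] * 65
y_pred = [1] * 15 + [1] * 5 + [0] * 15 + [0] * 65

c = confusion_counts(y_true, y_pred)
precision = c.tp / (c.tp + c.fp)
recall = c.tp / (c.tp + c.fn)
f1 = 2 * precision * recall / (precision + recall)
print(c)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Note that TN does not appear in any of the three formulas, which is why these metrics are not inflated by a large negative class.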
Short Interview Answer
Precision measures how many predicted positives are actually correct.
Recall measures how many actual positives the model successfully identified.
F1-score is the harmonic mean of precision and recall and is useful when we need a balance between them, especially for imbalanced datasets.
