Classification vs regression is one of the most frequently asked fundamentals questions in machine learning interviews. Both are supervised learning tasks; the key difference lies in the type of output they predict.
1. Classification
Definition:
Classification is a supervised learning technique where the model predicts a category or class label.
Output Type:
- Discrete values (labels/classes)
- Example: Yes/No, Spam/Not Spam, Cat/Dog
Examples:
- Email → Spam or Not Spam
- Medical diagnosis → Disease present or not
- Image recognition → Cat, Dog, Bird
Algorithms used:
- Logistic Regression
- Decision Trees
- Random Forest
- SVM (Support Vector Machine)
- Neural Networks
Evaluation Metrics:
- Accuracy
- Precision, Recall, F1-score
- ROC-AUC
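To make the threshold-based metrics above concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1-score from the confusion-matrix counts of a binary problem. The function name `classification_metrics` and the label lists are made up for illustration; in practice a library such as scikit-learn would normally be used instead. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.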
2. Regression
Definition:
Regression is a supervised learning technique where the model predicts a continuous numerical value.
Output Type:
- Continuous values
- Example: Price, Temperature, Salary
Examples:
- House price prediction → ₹50 lakhs
- Weather forecasting → 32.5°C
- Salary prediction → $80,000/year
Algorithms used:
- Linear Regression
- Polynomial Regression
- Decision Tree Regressor
- Random Forest Regressor
- Gradient Boosting Regressor
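As one concrete instance of the algorithms listed above, the following sketch fits simple (one-feature) linear regression with the closed-form least-squares solution. The helper `fit_line` and the size/price numbers are hypothetical, chosen so that the data lies exactly on a line; real data would be noisy and would usually be handled by a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # assumes xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size vs price, generated as price = 2 * size + 10
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90 + intercept  # continuous prediction for an unseen size
print(slope, intercept, predicted)
```

The point to notice is the output: the model returns a number on a continuous scale (here about 190 for a size of 90), not a class label.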
Evaluation Metrics:
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R² Score
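The four metrics above can be written out in a few lines. This is a minimal sketch using only the standard library; `regression_metrics` and the sample values are illustrative assumptions, not part of any particular library.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # average absolute error
    mse = sum(e * e for e in errors) / n    # average squared error
    rmse = math.sqrt(mse)                   # same units as the target

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1 - ss_res / ss_tot  # assumes y_true is not constant
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical true values and predictions
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 8.0, 9.5]
scores = regression_metrics(y_true, y_pred)
print(scores)
```

For these values, MAE is 0.5, MSE is 0.375, RMSE is roughly 0.612, and R² is 0.925. RMSE penalizes the single larger error (1.0) more heavily than MAE does, which is the usual reason to report both.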
Key Differences (Interview Table)
| Feature | Classification | Regression |
|---|---|---|
| Output Type | Categorical (labels) | Continuous (numbers) |
| Goal | Assign class | Predict value |
| Example | Spam detection | House price prediction |
| Algorithms | Logistic Regression, SVM | Linear Regression, Random Forest Regressor |
| Evaluation | Accuracy, F1-score | RMSE, MAE, R² |
Interview Tip (Very Important)
A common trick question:
“Is Logistic Regression used for classification or regression?”
✔ Answer: Classification
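Even though the name says “regression,” it is a classification algorithm. It fits a linear model to the log-odds of the positive class, passes the result through the sigmoid function to get a probability between 0 and 1, and assigns a class by thresholding that probability (commonly at 0.5). The “regression” in the name refers to that linear model on the log-odds. It is most often used for binary classification, and multinomial (softmax) variants extend it to more than two classes.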
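A short sketch of the prediction step shows why the answer is "classification": the sigmoid turns a linear score into a probability, and a threshold turns that probability into a class label. The weight and bias below are hypothetical values picked by hand, not parameters fitted to data, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weight, bias):
    """Probability of the positive class for a single feature x."""
    return sigmoid(weight * x + bias)


def predict_label(x, weight, bias, threshold=0.5):
    """Turn the probability into a discrete class label (0 or 1)."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0


# Hypothetical parameters (assumed, not learned from data)
weight, bias = 1.5, -3.0

for x in (0.0, 2.0, 4.0):
    print(x, round(predict_proba(x, weight, bias), 3), predict_label(x, weight, bias))
```

The intermediate output is continuous (a probability), but the final prediction is a discrete label, which is what makes the task classification.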
Simple Memory Trick
- Classification = Class labels (C for Category)
- Regression = Real numbers (R for Range/Real value)
