What is bias vs variance?

The bias-variance distinction is one of the most frequently asked concepts in machine learning interviews because it explains why a model generalizes poorly and points to how to improve it.

Bias (Underfitting problem)

Definition:
Bias is the error due to overly simplistic assumptions in the learning algorithm.

In simple terms:
The model is too simple to capture the patterns in data.

Example:

Fitting a linear regression model to a highly non-linear dataset.
A straight line cannot represent the curvature, so the model misses important patterns.

Behavior:

High training error
High test error
Poor performance on both, usually with only a small gap between training and test error
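
A minimal sketch of this behavior, assuming a made-up dataset (noisy samples of sin(x)) and NumPy's `np.polyfit` for the straight-line fit. The data, the noise level `NOISE_SD`, and the helper names are illustrative assumptions, not part of the original example:

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_SD = 0.1  # assumed noise level for this synthetic example

def make_data(n):
    """Noisy samples of sin(x) on [0, 2*pi] (a made-up non-linear dataset)."""
    x = rng.uniform(0.0, 2.0 * np.pi, n)
    y = np.sin(x) + rng.normal(0.0, NOISE_SD, n)
    return x, y

x_train, y_train = make_data(200)
x_test, y_test = make_data(200)

# Least-squares straight line: the "too simple" model.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def mse(x, y):
    """Mean squared error of the fitted line on (x, y)."""
    return float(np.mean((slope * x + intercept - y) ** 2))

train_mse = mse(x_train, y_train)
test_mse = mse(x_test, y_test)

print(f"train MSE: {train_mse:.3f}")
print(f"test MSE:  {test_mse:.3f}")
print(f"noise floor (NOISE_SD**2): {NOISE_SD ** 2:.3f}")
```

Both errors should come out well above the noise floor and close to each other, which is the typical signature of underfitting: more data of the same kind would not help, because the line cannot bend.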

Interview phrase:

“High bias leads to underfitting.”

Variance (Overfitting problem)

Definition:
Variance is the error due to the model being too sensitive to training data fluctuations.

In simple terms:
The model learns the training data too well, including noise.

Example:

A very deep decision tree that memorizes the training data.
It performs well on the training data but fails on unseen data.

Behavior:

Very low training error
High test error
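
A rough sketch of this behavior, assuming scikit-learn is available and using a made-up noisy sine dataset. The data generator and noise level are illustrative assumptions; `DecisionTreeRegressor` with `max_depth=None` stands in for "a very deep decision tree":

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(x); the noise is what the tree ends up memorizing."""
    X = rng.uniform(0.0, 2.0 * np.pi, size=(n, 1))
    y = np.sin(X[:, 0]) + rng.normal(0.0, 0.3, n)
    return X, y

X_train, y_train = make_data(100)
X_test, y_test = make_data(1000)

# No depth limit: the tree keeps splitting until each leaf holds one sample.
tree = DecisionTreeRegressor(max_depth=None, random_state=0)
tree.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, tree.predict(X_train))
test_mse = mean_squared_error(y_test, tree.predict(X_test))

print(f"train MSE: {train_mse:.4f}")
print(f"test MSE:  {test_mse:.4f}")
```

The training error should be essentially zero while the test error stays well above it; that gap between the two numbers is the practical symptom of high variance.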

Interview phrase:

“High variance leads to overfitting.”

Bias vs Variance Trade-off

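The goal is to balance simplicity against complexity: for squared-error loss, expected test error decomposes into bias squared plus variance plus irreducible noise, and making a model more flexible typically lowers bias while raising variance.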

Aspect	High Bias (Underfitting)	High Variance (Overfitting)
Model	Too simple	Too complex
Training error	High	Low
Test error	High	High
Problem	Misses patterns	Learns noise
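
The rows of this comparison can be reproduced on a toy problem. The sketch below assumes a synthetic target sin(3x), a small training set, and polynomial degree as the complexity knob (all illustrative choices), and uses NumPy's `Polynomial.fit`:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
NOISE_SD = 0.2  # assumed noise level

def true_fn(x):
    """Hypothetical ground-truth function for the toy problem."""
    return np.sin(3.0 * x)

x_train = rng.uniform(-1.0, 1.0, 30)
y_train = true_fn(x_train) + rng.normal(0.0, NOISE_SD, x_train.size)
x_test = np.linspace(-1.0, 1.0, 500)
y_test = true_fn(x_test) + rng.normal(0.0, NOISE_SD, x_test.size)

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 5, 20):  # too simple, reasonable, too complex
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))

print("degree   train MSE    test MSE")
for degree, (train_mse, test_mse) in results.items():
    print(f"{degree:>6}  {train_mse:10.4g}  {test_mse:10.4g}")
```

Training error can only go down as the degree grows, while test error is expected to be lowest for the middle degree: degree 1 plays the high-bias column and degree 20 the high-variance column.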

Intuition (Very important for interviews)

Think of a target board:

High bias: Shots land in a tight cluster far from the center (consistent but wrong)
High variance: Shots are scattered widely around the board (inconsistent from one attempt to the next)
Good model: Shots are close to the center and close to each other
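
In the target analogy, each "shot" is the prediction of a model trained on a different random training set. A small simulation can make that concrete; the sketch below assumes a known synthetic target sin(3x) and polynomial models (both illustrative choices) and estimates bias squared and variance by retraining many times:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
N_ROUNDS, N_TRAIN, NOISE_SD = 200, 30, 0.2  # assumed simulation settings

def true_fn(x):
    """Hypothetical ground truth: the center of the target."""
    return np.sin(3.0 * x)

x_eval = np.linspace(-0.9, 0.9, 50)  # fixed points at which every model "shoots"

def bias_variance(degree):
    """Estimate (bias^2, variance) of degree-`degree` polynomial fits."""
    preds = np.empty((N_ROUNDS, x_eval.size))
    for r in range(N_ROUNDS):
        # A fresh training set each round is one shot at the target.
        x = rng.uniform(-1.0, 1.0, N_TRAIN)
        y = true_fn(x) + rng.normal(0.0, NOISE_SD, N_TRAIN)
        preds[r] = Polynomial.fit(x, y, degree)(x_eval)
    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - true_fn(x_eval)) ** 2))  # offset from center
    variance = float(np.mean(preds.var(axis=0)))                  # spread of the shots
    return bias_sq, variance

results = {degree: bias_variance(degree) for degree in (1, 5, 12)}
for degree, (bias_sq, variance) in results.items():
    print(f"degree {degree:>2}: bias^2 = {bias_sq:.4g}, variance = {variance:.4g}")
```

The straight line should show the largest bias squared with little spread, and the degree-12 fit the largest spread. The bias estimate for very flexible models is itself noisy with only 200 rounds, so treat those numbers as rough.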

How to fix them (common interview follow-up)

To reduce Bias:

Use more complex models
Add more features
Reduce regularization
Train longer (for iteratively trained models such as neural networks or gradient boosting)
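
As one illustration of the regularization point, the sketch below fits closed-form ridge regression on polynomial features with progressively weaker penalties. The dataset, the feature degree, and the penalty values are assumptions chosen for the example, and for simplicity the intercept is penalized as well:

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_SD = 0.2  # assumed noise level

def make_data(n):
    """Noisy samples of sin(3x) on [-1, 1] (a made-up dataset)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(3.0 * x) + rng.normal(0.0, NOISE_SD, n)
    return x, y

def features(x, degree=5):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, N=degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    """Closed-form ridge: solve (X^T X + lam * I) w = X^T y.
    Simplification: the intercept column is penalized too."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

x_train, y_train = make_data(100)
x_test, y_test = make_data(1000)
X_train, X_test = features(x_train), features(x_test)

results = {}
for lam in (100.0, 1.0, 0.01):  # strong -> weak regularization
    w = fit_ridge(X_train, y_train, lam)
    train_mse = float(np.mean((X_train @ w - y_train) ** 2))
    test_mse = float(np.mean((X_test @ w - y_test) ** 2))
    results[lam] = (train_mse, test_mse)
    print(f"lambda = {lam:>6}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

With the strongest penalty the coefficients are shrunk so hard that the model underfits; relaxing the penalty lowers the training error and, in this setup, the test error as well.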

To reduce Variance:

Get more training data
Use regularization (L1/L2)
Reduce model complexity
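Use cross-validation to detect overfitting and to choose model complexity or regularization strength (it estimates generalization error rather than reducing variance by itself)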
Use ensemble methods (bagging, Random Forest)
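
To illustrate the ensemble point, the sketch below compares a single fully grown tree with a random forest on the same made-up noisy sine data. It assumes scikit-learn is available; the dataset and the choice of 200 trees are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(x) on [0, 2*pi] (a made-up dataset)."""
    X = rng.uniform(0.0, 2.0 * np.pi, size=(n, 1))
    y = np.sin(X[:, 0]) + rng.normal(0.0, 0.3, n)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(1000)

# One unrestricted tree: low bias, high variance.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# Many such trees, each fit on a bootstrap sample, with predictions averaged.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

tree_test_mse = mean_squared_error(y_test, tree.predict(X_test))
forest_test_mse = mean_squared_error(y_test, forest.predict(X_test))

print(f"single tree test MSE:   {tree_test_mse:.4f}")
print(f"random forest test MSE: {forest_test_mse:.4f}")
```

Averaging many trees trained on resampled data smooths out the noise each individual tree memorizes, so the forest is expected to score a lower test error than the single tree here.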

One-line interview answer

“Bias is error due to overly simple assumptions causing underfitting, while variance is error due to sensitivity to training data causing overfitting. The challenge is to balance both to achieve good generalization.”