Tool for HR, Hiring Managers, and the Leadership Team

What is Linear Regression?

Linear Regression is a Supervised Machine Learning algorithm used to predict a continuous numeric value based on one or more input features.

It tries to find the best-fit straight line between input variables and output values.

Simple Definition

Linear Regression predicts a value by finding the relationship between input and output variables using a straight line.

Examples:

  • Predicting house prices

  • Predicting salary based on experience

  • Predicting sales revenue

  • Predicting temperature

Real-World Example

Suppose we want to predict a person's salary based on their years of experience.

Experience (Years)    Salary ($)
1                     30,000
2                     35,000
3                     40,000
4                     45,000
5                     50,000

We can observe:

  • More experience → higher salary

  • The relationship follows a straight line: each extra year adds $5,000

Linear Regression finds the best line representing this pattern.

Linear Regression Equation

The equation is:

y = mx + b

Where:

  • y = predicted output

  • x = input feature

  • m = slope of the line

  • b = intercept

Applying It to the Example

Suppose the model learns:

Salary = 5000 × Experience + 25000

If experience = 6 years:

Prediction:

Salary = 5000(6) + 25000 = 55000

Predicted salary = $55,000
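
To see where the numbers 5000 and 25000 come from, here is a minimal sketch that fits a line to the five rows of the table using the standard least-squares formulas for slope and intercept. The helper name fit_line is made up for this illustration; in practice a library such as scikit-learn would usually do this step.

```python
# Data copied from the experience/salary table above.
experience = [1, 2, 3, 4, 5]
salary = [30000, 35000, 40000, 45000, 50000]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    fit_line is a hypothetical helper written for this example.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance of x and y divided by variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


m, b = fit_line(experience, salary)
predicted = m * 6 + b  # prediction for 6 years of experience

print(m, b, predicted)  # prints: 5000.0 25000.0 55000.0
```

Because the sample data lies exactly on a line, the fitted slope and intercept match the equation above with no leftover error.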

How Linear Regression Works

Steps:

  1. Collect data

  2. Plot data points

  3. Find the best-fit line

  4. Minimize prediction error

  5. Use the line to predict future values

The algorithm tries to minimize the difference between:

  • Actual value

  • Predicted value

This error is often measured using:

  • Mean Squared Error (MSE)
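
As a rough illustration of what "minimizing error" means, the sketch below computes MSE for two candidate lines on the salary data. The first line is the one from the example; the second line (slope 4000, intercept 30000) is a made-up alternative used only for comparison.

```python
# Data copied from the experience/salary table above.
experience = [1, 2, 3, 4, 5]
salary = [30000, 35000, 40000, 45000, 50000]


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def line_mse(slope, intercept):
    """MSE of the line y = slope * x + intercept on the salary data."""
    predictions = [slope * x + intercept for x in experience]
    return mean_squared_error(salary, predictions)


mse_learned = line_mse(5000, 25000)  # line from the example
mse_other = line_mse(4000, 30000)    # hypothetical alternative line

print(mse_learned)  # prints: 0.0
print(mse_other)    # prints: 6000000.0
```

The line with the lower MSE is the better fit. Training a Linear Regression model amounts to finding the slope and intercept with the lowest possible MSE.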

Types of Linear Regression

1. Simple Linear Regression

Uses:

  • One input variable

  • One output variable

Example:

  • Experience → Salary

2. Multiple Linear Regression

Uses:

  • Multiple input variables

Example:

  • House Size

  • Number of Rooms

  • Location

to predict:

  • House Price

Equation:

y = b_0 + b_1x_1 + b_2x_2 + ... + b_nx_n
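
A minimal sketch of Multiple Linear Regression, assuming NumPy is installed. The house data below is invented for illustration and was generated from price = 50000 + 100 × size + 10000 × rooms, so the fit should recover roughly those coefficients. Location is left out because it is a category rather than a number and would need to be encoded first.

```python
import numpy as np

# Made-up example data (not from any real dataset).
size_sqft = np.array([1000, 1500, 1200, 2000, 1800], dtype=float)
rooms = np.array([2, 3, 3, 4, 3], dtype=float)
price = np.array([170000, 230000, 200000, 290000, 260000], dtype=float)

# Design matrix: a column of ones for the intercept b_0,
# then one column per input feature (x_1 = size, x_2 = rooms).
X = np.column_stack([np.ones_like(size_sqft), size_sqft, rooms])

# Least-squares solution for [b_0, b_1, b_2].
coefficients, _, _, _ = np.linalg.lstsq(X, price, rcond=None)
b0, b1, b2 = coefficients

# Predict the price of a hypothetical 1600 sq ft, 3-room house.
predicted_price = b0 + b1 * 1600 + b2 * 3

print(coefficients)
print(predicted_price)
```

Each coefficient is read as the change in predicted price for a one-unit change in that feature while the other features stay fixed.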

Important Interview Terms

1. Dependent Variable

The output being predicted.

Example:

  • Salary

2. Independent Variable

Input feature used for prediction.

Example:

  • Experience

3. Best-Fit Line

The line that minimizes total prediction error.

4. Residual/Error

Difference between actual and predicted value.

Formula:

Residual = Actual - Predicted
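
A small sketch of the residual formula. The "actual" salaries here are hypothetical values chosen to sit slightly off the line Salary = 5000 × Experience + 25000, so that the residuals are not all zero.

```python
experience = [1, 2, 3, 4, 5]
# Hypothetical observed salaries (made up for this illustration).
actual = [31000, 34000, 40000, 46500, 49000]

# Predictions from the line used in the example above.
predicted = [5000 * x + 25000 for x in experience]

# Residual = Actual - Predicted, one value per observation.
residuals = [a - p for a, p in zip(actual, predicted)]

print(residuals)  # prints: [1000, -1000, 0, 1500, -1000]
```

A positive residual means the model predicted too low for that person, and a negative residual means it predicted too high.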

Assumptions of Linear Regression

Interviewers often ask this.

Linear Regression assumes:

  1. Linear relationship between variables

  2. No high multicollinearity

  3. Errors are normally distributed

  4. Constant variance of errors

  5. Independent observations

Advantages

  • Simple and easy to understand

  • Fast to train

  • Works well for linear data

  • Highly interpretable

Limitations

  • Works poorly with non-linear data

  • Sensitive to outliers

  • Assumes linear relationship

  • Can underperform on complex datasets
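
To illustrate the sensitivity to outliers, the sketch below refits the salary example after replacing the last salary with a deliberately extreme, made-up value of 150,000. The helper fit_line is a hypothetical name used only in this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


experience = [1, 2, 3, 4, 5]
salary = [30000, 35000, 40000, 45000, 50000]
# Same data, but the last salary is replaced by a made-up outlier.
salary_with_outlier = [30000, 35000, 40000, 45000, 150000]

slope_clean, intercept_clean = fit_line(experience, salary)
slope_outlier, intercept_outlier = fit_line(experience, salary_with_outlier)

print(slope_clean, intercept_clean)      # prints: 5000.0 25000.0
print(slope_outlier, intercept_outlier)  # prints: 25000.0 -15000.0
```

A single extreme point changes the slope from 5,000 to 25,000 per year, because squaring the errors gives large deviations a disproportionate influence on the fit.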

Common Interview Questions

Q1: Is Linear Regression used for classification?

No.

Linear Regression is used for:

  • Continuous numeric prediction

For classification, we usually use:

  • Logistic Regression

Q2: How do you evaluate Linear Regression?

Common metrics:

  • R² Score

  • MAE (Mean Absolute Error)

  • MSE (Mean Squared Error)

  • RMSE (Root Mean Squared Error)
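
The four metrics can be computed by hand as in the sketch below. The "actual" salaries are hypothetical values invented for this illustration, and the predictions come from the line Salary = 5000 × Experience + 25000. Libraries such as scikit-learn provide equivalent functions.

```python
import math

# Hypothetical observed salaries (made up) and the model's predictions.
actual = [31000, 34000, 40000, 46500, 49000]
predicted = [30000, 35000, 40000, 45000, 50000]

n = len(actual)
errors = [a - p for a, p in zip(actual, predicted)]

# MAE: average size of the error, in the same units as the target.
mae = sum(abs(e) for e in errors) / n

# MSE: average squared error; penalizes large errors more heavily.
mse = sum(e ** 2 for e in errors) / n

# RMSE: square root of MSE, back in the units of the target.
rmse = math.sqrt(mse)

# R²: 1 minus (residual sum of squares / total sum of squares).
mean_actual = sum(actual) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

print(mae)   # prints: 900.0
print(mse)   # prints: 1050000.0
print(rmse)  # about 1024.7
print(r2)    # about 0.978
```

Lower MAE, MSE, and RMSE mean smaller errors, while an R² closer to 1 means the model explains more of the variation in the target.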


Q3: What is overfitting?

When the model learns noise instead of actual patterns and performs poorly on new data.

Short Interview Answer

Linear Regression is a supervised learning algorithm used to predict continuous numeric values by finding the best-fit linear relationship between input and output variables. In its simplest form, with one input, it fits the equation y = mx + b, and it is commonly used for predictions like salary, sales, and house prices.

Easy Way to Remember

Think:

“Linear Regression draws the best straight line to predict numbers.”