Bias, Variance Trade-off in Machine Learning?

Bias is the error introduced when a learning algorithm is built on faulty or overly simple assumptions. If the bias is too large, the algorithm won’t be able to model the relationship between the input features (X) and the target variable (y).

“The error due to bias is taken as the difference between the average prediction of our model and the correct value which we are trying to predict.”  By Scott Fortmann-Roe

Effect of High Bias – If the model is not complex enough, it misses specific features or dynamics of the data. A low-complexity model, such as a straight line drawn through the data points, has high bias and low variance, so its predictions are, in general, far from the correct values.

A high-bias scenario leads to ‘underfitting’ the model.


Variance is the error that shows up when the model is run on unseen or new data. A high-variance model is very specific to its training set: it gives little error on the training data but much more error on new data.

Effect of High Variance – If the model is highly complex, it draws, for example, a high-degree polynomial curve through the data points. This gives high variance and low bias: the distance between predictions and correct values is very small on the training data, but the fit changes a lot from one training set to another.

A high-variance scenario leads to ‘overfitting’ the model.

“The error due to variance is taken as the variability of a model prediction for a given data point.”  By Scott Fortmann-Roe
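
The two definitions above can be checked numerically. The following is a minimal sketch, not a definitive recipe: it assumes a made-up true function (a sine wave), Gaussian noise, and a single query point `x0`, all of which are illustrative choices. A straight-line fit plays the low-complexity model, and a 1-nearest-neighbour predictor is swapped in for the high-degree polynomial as the highly complex model, only to keep the code free of external libraries.

```python
import math
import random
from statistics import mean, pvariance


def true_f(x):
    # Assumed ground-truth relationship (illustrative only).
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise=0.3):
    # Draw a fresh noisy training set from the assumed data process.
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    # Ordinary least-squares straight line: the low-complexity model.
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_1nn(xs, ys):
    # 1-nearest-neighbour: memorises the training set, the complex model.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def bias2_and_variance(fit, x0, n_repeats=500, seed=0):
    # Retrain on many independent training sets and predict at x0 each time.
    rng = random.Random(seed)
    preds = [fit(*sample_training_set(rng))(x0) for _ in range(n_repeats)]
    bias2 = (mean(preds) - true_f(x0)) ** 2  # (average prediction - truth)^2
    variance = pvariance(preds)              # spread of predictions at x0
    return bias2, variance


x0 = 0.25  # query point at the peak of the sine, where a line fits badly
line_bias2, line_var = bias2_and_variance(fit_line, x0)
nn_bias2, nn_var = bias2_and_variance(fit_1nn, x0)

print(f"straight line: bias^2={line_bias2:.3f}  variance={line_var:.3f}")
print(f"1-NN         : bias^2={nn_bias2:.3f}  variance={nn_var:.3f}")
```

Under these assumptions the straight line should show the larger squared bias and the 1-nearest-neighbour model the larger variance, matching the underfitting and overfitting descriptions.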

Different combinations of bias and variance levels

The above figures show the variation in data points

Bias, Variance trade-off

The bias-variance trade-off is about finding the optimal point for the model complexity. We can reduce either bias or variance, but lowering one usually raises the other, so in general we can’t reduce both simultaneously. The optimal point is found by minimizing the total error, measured for example by MSE (mean squared error).
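
The link between MSE, bias and variance can be written out as the standard decomposition of the expected squared error at a point x. The symbols are assumptions introduced here for illustration: the data follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the model fitted on a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model, which is why the optimal complexity is the one where their sum is smallest.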

Why Trade-off?

  • To minimize the error and get maximum accuracy from the model.
  • To avoid overfitting and underfitting.
  • To get consistent predictions.

How to overcome bias and variance problems?

  • Training & Testing data
  • Cross validation
  • Dimensionality Reduction
  • Regularization in Linear models/ANN
  • Concept of overfitting
  • Ensemble Learning
  • Optimal value of k in KNN
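
Two items from the list above, cross validation and the optimal value of k in KNN, can be combined in one short sketch. It assumes a synthetic one-dimensional regression problem (a noisy sine wave) and a hand-written k-nearest-neighbour regressor. The helper names `knn_predict` and `cv_mse` are hypothetical and not taken from any library.

```python
import math
import random
from statistics import mean


def knn_predict(train, x, k):
    # Average the targets of the k training points closest to x.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def cv_mse(data, k, n_folds=5):
    # k-fold cross validation: each fold is held out once for testing.
    folds = [data[i::n_folds] for i in range(n_folds)]
    errors = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.extend((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
    return mean(errors)


# Synthetic data: assumed sine relationship plus Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.random()
    data.append((x, math.sin(2 * math.pi * x) + rng.gauss(0, 0.3)))

# Small k = flexible model (high variance); large k = rigid model (high bias).
candidate_ks = (1, 3, 5, 10, 20, 50, 100, 160)
scores = {k: cv_mse(data, k) for k in candidate_ks}
best_k = min(scores, key=scores.get)

for k in candidate_ks:
    print(f"k={k:3d}  cross-validated MSE={scores[k]:.3f}")
print("best k:", best_k)
```

With k=1 the model follows the noise in each training point, and with k=160 it uses every training point in the fold and predicts roughly the overall mean. The cross-validated error should be lowest somewhere in between, which is the trade-off point the article describes.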