Bias ** 2

Can you elaborate why when we do error decomposition it is bias squared:
MSE = Bias ** 2 + Var + Irreducible Error
Beyond the fact that is comes just from math. Is there any ‘physical meaning’ why it is squared? Can you explain it in simple words, please?
It’ll be much appreciated.

You can find the proof on the following wiki page (in the Sect. “Estimator”):

Without the square, you don’t have the equivalence between the MSE and the decomposition in terms of bias-variance.

1 Like