Does high variance really == overfitting?

I was checking the Wikipedia entry referenced at the end of the slides, which says:

“It is an often made fallacy to assume that complex models must have high variance; High variance models are ‘complex’ in some sense, but the reverse needs not be true”

This seems to contradict the statement in my title, which I took straight from the slides (in the “Take home message”):

High variance == overfitting

I am assuming (based only on the previous lectures) that overfitting == high complexity in the model (e.g. a high-order polynomial).

Is my assumption wrong (I am no data science expert, so that’s likely), or am I missing something else here?

I think you’re reading too much into the word ‘complex’ in this context.

Overfitting simply means that the model is not good at generalizing beyond the training data. Complexity (e.g. the order of the polynomial) could be one of the reasons for the overfitting.
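To make that concrete, here is a minimal sketch (my own toy example, not from the course, using only numpy): fit polynomials of increasing degree to a handful of noisy samples of a sine curve and compare the error on the training points with the error on held-out points.

```python
# Toy illustration of overfitting: low training error, high test error.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of a sine curve on [0, 1]."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = make_data(10)    # small training set
x_test, y_test = make_data(200)     # held-out data from the same distribution

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

Typically the degree-9 fit drives the training error to (almost) zero but does much worse than degree 3 on the held-out points: it memorises the noise instead of generalising, which is exactly the “not good at generalizing” part.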

Again, I’m no scientist here.

The Wikipedia page with that sentence is here. If you look at the references (for example on Bias–variance tradeoff - Wikipedia), it seems this sentence has an implicit deep-learning context: deep learning models have a huge number of parameters and yet still generalise well.
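For reference (standard result, my notation, not taken from the slides), the decomposition that page is about: for a model $\hat f$ trained on a random dataset $D$, with targets $y = f(x) + \varepsilon$, the expected squared error at a point $x$ splits as

$$\mathbb{E}_{D,\,\varepsilon}\big[(y - \hat f(x; D))^2\big] = \underbrace{\big(\mathbb{E}_D[\hat f(x; D)] - f(x)\big)^2}_{\text{bias}^2} \;+\; \underbrace{\mathrm{Var}_D\big[\hat f(x; D)\big]}_{\text{variance}} \;+\; \underbrace{\sigma^2}_{\text{irreducible noise}}$$

The “variance” in “high variance” is the middle term: how much the fitted model moves around when you retrain it on a fresh training set. The Wikipedia caveat is that a model can have many parameters (“complex”) and still keep that middle term small.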

Pragmatic answer:

  • overfitting == high variance == too complex is a good enough approximation of the truth (there’s a rough sketch of what “variance” means after this list)
  • this is an introductory machine-learning course, so these finer points are beyond its scope
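And to ground the “high variance” part of that approximation, here is a rough numpy sketch (again my own toy example, same sine setup as above): “variance” here just means how much the model’s predictions change when you retrain it on different training sets drawn from the same distribution.

```python
# Toy estimate of model variance: refit the same model on many fresh
# training sets and measure how much its predictions move around.
import numpy as np

rng = np.random.default_rng(0)
x_grid = np.linspace(0.05, 0.95, 50)           # fixed query points

def draw_training_set(n=10):
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

for degree in (1, 3, 9):
    preds = []
    for _ in range(200):                       # 200 independent training sets
        x_tr, y_tr = draw_training_set()
        coeffs = np.polyfit(x_tr, y_tr, degree)
        preds.append(np.polyval(coeffs, x_grid))
    variance = np.array(preds).var(axis=0).mean()   # average prediction variance
    print(f"degree {degree}: mean prediction variance = {variance:.3f}")
```

The degree-9 predictions swing wildly from one training set to the next (high variance, the overfitting regime), while the degree-1 predictions barely move (low variance, but high bias, since a straight line can’t follow the sine curve).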

Full disclosure: I’m not a deep-learning expert myself, so I don’t really know the details behind this sentence …
