Exercise M2.01

In the videos and notebooks of this module, the way to decide whether a model overfits and/or underfits is to compare the evolution of the train-set error with that of the test-set error (after choosing, of course, the error metric on which the comparison is based). But here we have a classification problem, and I am a little surprised that the proposed solution compares the “accuracy” of the predictions on the train and test sets instead. Is the logic the same?
I would also like to point out that the legend of the two curves in the solution looks strange to me: I suppose that “trainning error” and “testing error” should read “training error” and “testing error”.

Indeed, we simply compare a metric between the train and test sets. For regression we can use MAE, MSE, or R2, and for classification we can use accuracy or any other classification metric. The evolution of the metric, and hence the conclusion about over- or underfitting, is usually the same whichever metric is chosen.
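
For what it is worth, here is a minimal sketch of that comparison in scikit-learn (not the exercise's exact code; the dataset and model below are illustrative assumptions), using `cross_validate` with `return_train_score=True` so the same logic works whether the score is accuracy or a regression metric:

```python
# Minimal sketch: compare train and test accuracy with cross-validation
# to diagnose over- or underfitting. Dataset and model are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

cv_results = cross_validate(
    model, X, y, cv=5,
    scoring="accuracy",  # for regression, e.g. "neg_mean_absolute_error" or "r2"
    return_train_score=True,
)

train_score = cv_results["train_score"].mean()
test_score = cv_results["test_score"].mean()
print(f"train accuracy: {train_score:.3f}, test accuracy: {test_score:.3f}")
# A high train score with a much lower test score suggests overfitting;
# low scores on both sets suggest underfitting.
```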

I don’t see the typo in the latest version of the notebooks: 📃 Solution for Exercise M2.01 — Scikit-learn course

Could you confirm that we are referring to the same figures?

Thank you for the clarification about the metrics.
Since we use accuracy as the metric, the legends of the two figures should be “training accuracy” and “testing accuracy”.
I misspoke in the previous post by writing “testing error” and “trainning error”.

Oh right. Thanks for pointing this out. We need to change it.

Solved in FIX change error by score since we use accuracy · INRIA/scikit-learn-mooc@1aa1809 · GitHub

The changes will appear after synchronizing the notebooks in FUN.