Hyperparameter tuning & CV

If we run cross-validation and hyperparameter tuning at the same time, does it violate generalization or not?

This is answered in the “Evaluation and hyperparameter tuning” notebook in the “Automated tuning” module:

One important caveat here concerns the evaluation of the generalization performance. Indeed, the mean and standard deviation of the scores computed by the cross-validation in the grid-search are potentially not good estimates of the generalization performance we would obtain by refitting a model with the best combination of hyper-parameter values on the full dataset. […] We therefore used knowledge from the full dataset to both decide our model’s hyper-parameters and to train the refitted model.
Because of the above, one must keep an external, held-out test set for the final evaluation of the refitted model.
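As a minimal sketch of that practice (the dataset and hyperparameter grid here are illustrative choices, not from the notebook): hold out a test set with `train_test_split`, tune with `GridSearchCV` on the training portion only, and score the refitted best model on the held-out data.

```python
# Sketch: tune hyperparameters with cross-validation on the training split,
# then evaluate the refitted best model on a held-out test set.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # illustrative dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10]}  # illustrative grid

# Inner cross-validation: used only to pick hyperparameters.
# The CV scores stored in search.cv_results_ are optimistic estimates.
search = GridSearchCV(model, param_grid, cv=5)
search.fit(X_train, y_train)

# Final evaluation on data never seen during tuning or refitting.
test_score = search.score(X_test, y_test)
```

Note that `GridSearchCV` refits the best model on the whole training split by default (`refit=True`), so `search.score` evaluates exactly the refitted model the quote discusses; for an unbiased estimate of the tuning procedure itself, one would go further and wrap the search in an outer `cross_validate` (nested cross-validation).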

FYI, I edited your message to add a link to the notebook in question, so that we encourage our users to adopt good practices :wink: