Caution ... and after

funBruneau · 7 March 2022 12:16

Hello,
I don’t understand this warning:

“Caution!
Be aware that we use train_test_split here for didactic purposes, to show the scikit-learn API.”

In “real life”, what would you have used instead? The result of this didactic split is actually used in the rest of the processing!

glemaitre58 · 7 March 2022 12:51

In real life, one should use cross-validation, as previously demonstrated. Without cross-validation, you will not be aware of the uncertainty of the statistical performance of a model.
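
To make the difference concrete, here is a minimal sketch rather than the course's own code. The synthetic dataset from `make_classification`, the `LogisticRegression` model, and the choice of 5 folds are all assumptions picked for illustration. A single `train_test_split` gives one score, while `cross_validate` gives one score per fold, so you can look at the spread.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate, train_test_split

# Synthetic data, used here only for illustration (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# A single split yields a single number: no information about its variability.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
single_score = model.fit(X_train, y_train).score(X_test, y_test)

# Cross-validation yields one test score per fold.
cv_results = cross_validate(model, X, y, cv=5)
scores = cv_results["test_score"]

print(f"Single split accuracy: {single_score:.3f}")
print(f"Cross-validation accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The standard deviation across folds is only a rough indication of the uncertainty of the estimate, but it is information that a single split cannot provide at all.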

ogrisel · 7 March 2022 13:39

To be precise: the uncertainty of your estimation of the statistical performance of a model.

funBruneau · 7 March 2022 17:08

Thank you, the link with the cross-validation had escaped me.

lesteve · 9 March 2022 17:09

I am tagging this for v3, since we should probably mention that cross-validation may be preferred.

ArturoAmorQ · 23 May 2022 09:15

Addressed in Pull Request #633, “Improve description in caution message”, by ArturoAmorQ (INRIA/scikit-learn-mooc on GitHub).