Split/folds

" More detail regarding cross_validate
…To make it explicit, it is possible to retrieve these fitted models for each of the splits/folds by passing the option return_estimator=True in cross_validate."

Where do ‘folds’ suddenly appear from? They seem to be a way of doing CV, but they are not mentioned anywhere before this. And as far as I can tell, CV here is using ShuffleSplit(n_splits=40..., so where do k-folds come in?

The concept of “fold” first appears in the notebook Model evaluation using cross-validation, where it is used as a synonym for “split” or “partition”. The number of folds is what scikit-learn calls n_splits in its cross-validation notation, regardless of the particular strategy used, e.g. KFold(n_splits=40), ShuffleSplit(n_splits=40), etc.
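As a rough check of this, the sketch below runs cross_validate with return_estimator=True under two different strategies and counts the fitted models that come back. The toy dataset from make_regression, the LinearRegression model, and n_splits=5 (instead of 40, to keep it quick) are assumptions chosen for illustration and are not taken from the notebook.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_validate

# Toy regression data (an assumption for illustration, not the MOOC dataset).
X, y = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)

# Two different strategies, both configured through n_splits.
strategies = {
    "KFold": KFold(n_splits=5),
    "ShuffleSplit": ShuffleSplit(n_splits=5, test_size=0.2, random_state=0),
}

n_fitted = {}
for name, cv in strategies.items():
    cv_results = cross_validate(
        LinearRegression(), X, y, cv=cv, return_estimator=True
    )
    # One fitted model per split/fold, stored under the "estimator" key.
    n_fitted[name] = len(cv_results["estimator"])
    print(f"{name}: {cv.get_n_splits(X)} splits, {n_fitted[name]} fitted models")
```

In both cases the number of fitted models equals n_splits. The strategies still differ in how the partitions are drawn: the KFold test sets do not overlap, while ShuffleSplit draws each train/test partition independently, so its test sets may overlap.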

Does this answer the question?

Ah, so [‘splits’, ‘folds’, ‘partitions’] are just synonyms, and their number is always specified with n_splits=.

Thank you

Yes. For further reference, this is also discussed in this forum post.

I tagged this as priority-mooc-v4. I think we should use “splits” consistently, rather than switching between “folds” and “splits”.