Wrap-up quiz 1, Question 6, the meaning of "fold"

Hi,
I didn’t understand the “fold” thingy.
Could someone please explain it to me in simple words, with one or two examples?
Thanks in advance, T.G.

A fold is a group of samples. KFold divides all the samples into K folds of roughly equal size (in scikit-learn, K is set with the n_splits parameter). In each iteration, K-1 folds are used for training and the remaining fold for testing. This is repeated K times, so every fold serves as the test set exactly once.
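To make this concrete, here is a minimal sketch on a toy dataset of 10 samples (the array X is made up for illustration, it is not from the quiz). It prints which sample indices land in the training and testing groups at each iteration:

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (an assumption for illustration): 10 samples, 1 feature.
X = np.arange(10).reshape(-1, 1)

# K = 5 folds; KFold does not shuffle the samples by default.
kfold = KFold(n_splits=5)

# Each split is a pair (train indices, test indices).
splits = list(kfold.split(X))
for i, (train_idx, test_idx) in enumerate(splits):
    print(f"Iteration {i}: train={train_idx}, test={test_idx}")
```

With 10 samples and 5 folds, each fold holds 2 samples, so every iteration trains on 8 samples and tests on the 2 that were left out.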

In the image below there is an example of 5-fold cross-validation. Green groups of samples are used for training and blue for testing. Gray denotes the dataset splitting before training or testing.

[Image: 5-fold cross-validation]

For more info see the “Validation of a Model” video.


@ArturoAmorQ, thanks for your quick answer.
So, if I write: cv_results_num = cross_validate(model, data_numerical, target, cv=7) then cv=7 means 7 folds, right?

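You are right. When you pass an integer as cv, the cross_validate function uses a K-fold strategy and the integer denotes the number of folds. Note that the exact splitter depends on the model: KFold is used in general, and StratifiedKFold when the model is a classifier and the target is binary or multiclass. If you leave cv unset, 5 folds are used.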
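As a quick check of the cv=7 case, here is a small sketch. The synthetic dataset and the LogisticRegression model are assumptions standing in for data_numerical, target and model from the course notebook:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic stand-ins (assumptions) for data_numerical and target.
data, target = make_classification(n_samples=140, n_features=5, random_state=0)
model = LogisticRegression()

# cv=7 requests 7 folds, so the model is fitted and scored 7 times.
cv_results = cross_validate(model, data, target, cv=7)

# One test score per fold.
print(len(cv_results["test_score"]))
print(cv_results["test_score"])
```

The length of cv_results["test_score"] matches the number of folds, one score for each time a fold was held out for testing.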
