Cross Validation on whole dataset vs train/test split

In relation to Exercise M3.01, wouldn't it be better to skip the train/test split and instead run cross-validation on the entire dataset?

That would use more samples/observations for validation than validating on the training set alone.

Any thoughts?


In practice, you would use nested cross-validation, but that is introduced in the next notebook. This first exercise is only meant to give some intuition about hyperparameter tuning and evaluation.
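
For readers who want a preview, here is a minimal sketch of what nested cross-validation can look like with scikit-learn. It is not the code from the exercise or the next notebook: the synthetic dataset, the logistic regression pipeline, and the grid of `C` values are all assumptions chosen only to keep the example small. The inner loop (`GridSearchCV`) tunes the hyperparameter, and the outer loop (`cross_val_score`) evaluates the tuned model on folds the search never saw, so the whole dataset is used for evaluation without leaking test data into the tuning.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used here only as a stand-in for the real dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Illustrative model and grid (assumed, not taken from the exercise).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# Inner loop: picks the best C using only the outer training fold.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(model, param_grid=param_grid, cv=inner_cv)

# Outer loop: scores the tuned model on data unseen during tuning.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(search, X, y, cv=outer_cv)

print(f"Nested CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Running plain cross-validation on the whole dataset to both choose the hyperparameters and report the score would tend to give an optimistic estimate, because the reported score comes from the same folds that guided the choice. The outer loop above is what avoids that.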