As mentioned in the notebook “Cross-validation framework” in the “Overfitting and Underfitting” section of Module 2:
a single train-test split does not give any indication of the robustness of the evaluation of our predictive model: in particular, if the test set is small, the estimate of the testing error will be unstable and will not reflect the “true error rate” we would have observed with the same model on an unlimited amount of test data.
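To see this instability concretely, here is a minimal sketch (not taken from the notebook, using a small synthetic dataset from make_classification as a stand-in for real data): the same model is scored on several different random train-test splits, and the resulting test accuracies vary noticeably from one split to the next.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small synthetic dataset, used only for illustration
X, y = make_classification(n_samples=200, random_state=0)

# Score the same model on several different random splits
for seed in range(5):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = LogisticRegression().fit(X_train, y_train)
    print(f"split {seed}: test accuracy = {model.score(X_test, y_test):.3f}")
```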
In that sense, it is better to use cross-validation: either with the KFold strategy, the ShuffleSplit strategy, or other strategies that will be covered in more detail in Module 7, as sketched below.
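As a hedged sketch of how both strategies can be used (again with a synthetic dataset, which is an assumption and not the data from the course), each splitter can simply be passed as the `cv` argument of cross_val_score:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression()

# KFold: 5 non-overlapping test folds, each sample is used for testing exactly once
kfold_scores = cross_val_score(model, X, y, cv=KFold(n_splits=5))

# ShuffleSplit: 10 independent random splits, test sets may overlap
shuffle_scores = cross_val_score(
    model, X, y, cv=ShuffleSplit(n_splits=10, test_size=0.25, random_state=0)
)

print("KFold scores:       ", kfold_scores)
print("ShuffleSplit scores:", shuffle_scores)
```

Instead of a single test score, each strategy returns one score per split, which already gives a first idea of the variability of the evaluation.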
Which strategy to use depends on the dataset and the use case: for instance, KFold may suffice to estimate the generalization performance of a model, whereas ShuffleSplit provides more information about how the scores are distributed across splits and is more robust to the ordering of the dataset, as will be covered in Module 7 and illustrated in the sketch below.
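The sensitivity to dataset ordering can be illustrated with the iris dataset, which is stored sorted by class (this choice of dataset is mine, not the course's). With non-shuffled KFold and 3 splits, each test fold contains a single class that is absent from the training set, so the scores collapse, while ShuffleSplit is unaffected because it draws random splits:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score

# The iris samples are ordered by class: 50 of class 0, then 50 of class 1, then 50 of class 2
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# KFold without shuffling builds contiguous folds: each test fold holds one
# class that never appears in the corresponding training set
kfold_scores = cross_val_score(model, X, y, cv=KFold(n_splits=3))

# ShuffleSplit shuffles before splitting, so the ordering does not matter
shuffle_scores = cross_val_score(
    model, X, y, cv=ShuffleSplit(n_splits=10, test_size=0.25, random_state=0)
)

print("KFold (no shuffling):", kfold_scores)   # all scores are 0.0
print("ShuffleSplit:        ", shuffle_scores)  # scores close to 1.0
```

Passing `shuffle=True` to KFold would also remove this ordering effect, but ShuffleSplit additionally lets you choose the number of splits and the test size independently of each other.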