Regarding Exercise M3.01, wouldn't it be better to skip the train/test split and instead run cross-validation on the entire dataset?
That would use more samples for validation than cross-validating on the training set alone.
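
To make the comparison concrete, here is a rough sketch of the two setups I have in mind. The synthetic data, the `LogisticRegression` model, the 25% test size, and the 5 folds are all assumptions for illustration, not what the exercise actually uses.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate, train_test_split

# Synthetic stand-in data (an assumption; not the exercise's dataset).
data, target = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression()
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Option A: hold out a test set, then cross-validate on the training part only.
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.25, random_state=0
)
cv_train = cross_validate(model, data_train, target_train, cv=cv)
# Each training sample lands in exactly one validation fold.
n_validated_train = sum(len(val_idx) for _, val_idx in cv.split(data_train))

# Option B: no held-out test set, cross-validate on the entire dataset.
cv_full = cross_validate(model, data, target, cv=cv)
n_validated_full = sum(len(val_idx) for _, val_idx in cv.split(data))

print("Samples validated, train only:", n_validated_train)
print("Samples validated, full data: ", n_validated_full)
print("Mean CV score, train only:", cv_train["test_score"].mean())
print("Mean CV score, full data: ", cv_full["test_score"].mean())
```

Option B validates on every sample, but it leaves nothing untouched for a final check, which is the trade-off I am asking about.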
Any thoughts?