Wrap Up Quiz

Hello.

Why should we use KFold cross-validation when comparing the performance of the random forest model vs. the gradient boosting one? I believe KFold was also used on another occasion before, and I did not understand why. Is it not the basic strategy implemented in cross_validate?

The last question concerning the BalancedBaggingClassifier is not clear: you do not state that the base_estimator should be a HistGradientBoostingClassifier. The default estimator is a decision tree, so I obviously got a worse result.

It depends: KFold is the default in cross_validate if the model is a regressor, while StratifiedKFold is used if the model is a classifier. Passing an explicit KFold instance also guarantees that both models are evaluated on exactly the same splits, which makes their scores directly comparable.
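For context, here is a minimal sketch (on a synthetic dataset, not the quiz data) of this idea: reusing the same shuffled KFold object for both models ensures an apples-to-apples comparison.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import KFold, cross_validate

X, y = make_classification(n_samples=1_000, random_state=0)

# The same KFold instance is passed to both calls, so both models
# are scored on identical train/test splits.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

rf_results = cross_validate(RandomForestClassifier(random_state=0), X, y, cv=cv)
gbdt_results = cross_validate(HistGradientBoostingClassifier(random_state=0), X, y, cv=cv)

print(f"RF:   {rf_results['test_score'].mean():.3f}")
print(f"GBDT: {gbdt_results['test_score'].mean():.3f}")
```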

True, we could be more explicit and mention that the base estimator is the same GBDT.
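A minimal sketch of how one could pass the GBDT explicitly, assuming a version of imbalanced-learn where the parameter is named `base_estimator` (it was renamed to `estimator` in more recent releases):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from imblearn.ensemble import BalancedBaggingClassifier

X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)

# The default base estimator is a DecisionTreeClassifier; to match the
# quiz's intent, the same GBDT must be passed explicitly.
model = BalancedBaggingClassifier(
    base_estimator=HistGradientBoostingClassifier(random_state=0),
    n_estimators=10,
    random_state=0,
).fit(X, y)
```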

OK, thanks! And so in which cases do we have to use an ad hoc cross-validation strategy instead of the default one?

It is the topic of module 7 → Choice of cross-validation :slight_smile:


The ensemble wrap-up quiz has been reworked (it now uses a different dataset).