Wrap Up Quiz

Hello.

Why should we use KFold cross-validation when comparing the performance of the random forest model vs. the gradient boosting one? I believe KFold was also used on another occasion before, and I did not understand why. Is it not the basic strategy implemented in cross_validate?

The last question concerning the BalancedBaggingClassifier is not clear: you do not state that the base_estimator should be a HistGradientBoostingClassifier. The default estimator is a decision tree, so I obviously got a worse result.

It depends: KFold is the default in cross_validate if the model is a regressor, while StratifiedKFold is used if the model is a classifier. Passing an explicit KFold instance also guarantees that both models are evaluated on exactly the same splits, which makes their scores directly comparable.
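For context, here is a minimal sketch (on a synthetic dataset, not the quiz data) of this idea: reusing the same shuffled KFold object for both models ensures an apples-to-apples comparison.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import KFold, cross_validate

X, y = make_classification(n_samples=1_000, random_state=0)

# The same KFold instance is passed to both calls, so both models
# are scored on identical train/test splits.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

rf_results = cross_validate(RandomForestClassifier(random_state=0), X, y, cv=cv)
gbdt_results = cross_validate(HistGradientBoostingClassifier(random_state=0), X, y, cv=cv)

print(f"RF:   {rf_results['test_score'].mean():.3f}")
print(f"GBDT: {gbdt_results['test_score'].mean():.3f}")
```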

True, we could be more explicit and mention that the base estimator is the same GBDT.
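A minimal sketch of how one could pass the GBDT explicitly, assuming a version of imbalanced-learn where the parameter is named `base_estimator` (it was renamed to `estimator` in more recent releases):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from imblearn.ensemble import BalancedBaggingClassifier

X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)

# The default base estimator is a DecisionTreeClassifier; to match the
# quiz's intent, the same GBDT must be passed explicitly.
model = BalancedBaggingClassifier(
    base_estimator=HistGradientBoostingClassifier(random_state=0),
    n_estimators=10,
    random_state=0,
).fit(X, y)
```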

OK, thanks! And so in which cases do we have to use an ad hoc cross-validation strategy instead of the default one?

It is the topic of module 7 → Choice of cross-validation :slight_smile:


The ensemble wrap-up quiz has been reworked (it now uses a different dataset).