Question 8

Hi,

in the official documentation

https://imbalanced-learn.org/stable/references/generated/imblearn.ensemble.BalancedBaggingClassifier.html

the following is written:

image

Based on that I understood that:

  • first, the complete dataset is resampled, and
  • afterwards, the bootstrap samples are generated

And as a result, I didn’t select the right answer in the QUIZ.

To be honest, the right answer in the quiz makes more sense.

Did I misunderstand the documentation, or is the documentation a bit misleading?

Thanks

Miguel

Yes, the answer is not accurate enough. Your understanding is correct, and the result will be almost what is described in the answer. We should correct the answer. Thanks for reporting.

We should indeed rephrase the question and the answer. The fact that BalancedBaggingClassifier does 2 nested resamplings is a technical detail caused by the complex inheritance structure of that class. In an ideal world, a single resampling could serve both balancing purposes and bagging purposes (the randomness needed to combat overfitting by averaging predictions).

What is important is that each base estimator is trained independently on a resampled training set with approximately balanced classes, obtained by undersampling the classes that are naturally over-represented in the original dataset.
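
To make that last point concrete, here is a minimal sketch of the idea using only the Python standard library. It is not how imbalanced-learn implements BalancedBaggingClassifier (which, as noted above, performs two nested resamplings); it only illustrates the simplified "single resampling per base estimator" view. The function name `balanced_undersample`, the toy labels, and the choice of 3 estimators are all assumptions made for illustration.

```python
import random
from collections import Counter


def balanced_undersample(labels, rng):
    """Return indices of a class-balanced subsample of `labels`.

    Hypothetical helper: every class is randomly undersampled (without
    replacement) down to the size of the rarest class.
    """
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    minority_size = min(len(ix) for ix in indices_by_class.values())

    sampled = []
    for class_indices in indices_by_class.values():
        sampled.extend(rng.sample(class_indices, minority_size))
    rng.shuffle(sampled)
    return sampled


# Toy imbalanced labels: 90 samples of class 0, 10 samples of class 1.
labels = [0] * 90 + [1] * 10

# Each (hypothetical) base estimator gets its own independent resampling,
# driven by its own random seed.
n_estimators = 3
training_sets = [
    balanced_undersample(labels, random.Random(seed))
    for seed in range(n_estimators)
]

# Class counts seen by each base estimator.
class_counts = [Counter(labels[i] for i in idx) for idx in training_sets]
for counts in class_counts:
    print(dict(counts))
```

In a real ensemble, one classifier would then be fitted on each of these index sets and the predictions averaged. Because each estimator draws a different random subset of the majority class, the ensemble as a whole still sees much of the original data even though each individual training set is small and balanced.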

The ensemble wrap-up has been reworked (different dataset)
