Addressing class imbalance

Is there any way to handle class imbalance when using HistGradientBoostingClassifier? A problem I have has 97% class 1 and 3% class 0. HistGradientBoostingClassifier gives 97% accuracy, even when using randomized search with cross-validation to select best_params and cross-validation for evaluation, but that's no better than a DummyClassifier with the “most frequent” strategy!
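
Roughly what I'm doing, as a simplified sketch (a synthetic dataset stands in for my real data, and it assumes a recent scikit-learn where the import needs no experimental flag):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for my data: roughly 3% class 0, 97% class 1
X, y = make_classification(
    n_samples=10_000, n_features=20, weights=[0.03, 0.97], random_state=0
)

hgb = HistGradientBoostingClassifier(random_state=0)
dummy = DummyClassifier(strategy="most_frequent")

# On my real data both of these land around 0.97 accuracy
print(cross_val_score(hgb, X, y, cv=5, scoring="accuracy").mean())
print(cross_val_score(dummy, X, y, cv=5, scoring="accuracy").mean())
```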

Thank you.

It is an enormous topic. Maybe you can take a look at imbalanced-learn, which is compatible with all the scikit-learn conventions.

Thanks Arturo, I'll have a look. I'm not a big fan of undersampling and oversampling, though.
I really like how class_weight in LogisticRegression lets you penalize
misclassification of the rare class and thereby improve accuracy for both classes.
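
For example, a quick sketch on made-up imbalanced data (not my real problem):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=10_000, n_features=20, weights=[0.03, 0.97], random_state=0
)

# class_weight="balanced" weights errors inversely to class frequency,
# so mistakes on the rare class 0 cost roughly 32x more than on class 1
clf = LogisticRegression(class_weight="balanced", max_iter=1000)

# Score with balanced accuracy, since plain accuracy hides the imbalance
print(cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy").mean())
```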

It appears HistGradientBoostingClassifier does not have a class_weight-like
parameter to handle class imbalance.
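
One workaround I've seen: fit does accept sample_weight, so you can compute balanced per-sample weights yourself. A rough, untested sketch:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.utils.class_weight import compute_sample_weight

X, y = make_classification(
    n_samples=10_000, n_features=20, weights=[0.03, 0.97], random_state=0
)

# Give each sample the weight its class would get under
# class_weight="balanced": samples of the rare class 0 count for much more
sw = compute_sample_weight(class_weight="balanced", y=y)

hgb = HistGradientBoostingClassifier(random_state=0)
hgb.fit(X, y, sample_weight=sw)
```

Newer scikit-learn releases may also accept class_weight directly on this estimator; worth checking the docs for your version.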
regards
vin

Hi vinorda

In this situation, I think that using a one-class classifier would be a good choice.
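
Something along these lines, as a rough sketch: train OneClassSVM on the majority class only and treat the rare class 0 as the “outlier” (the hyperparameters here are placeholders, not tuned):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.svm import OneClassSVM

X, y = make_classification(
    n_samples=10_000, n_features=20, weights=[0.03, 0.97], random_state=0
)

# Train only on the majority class, so the model learns what "normal" looks like
oc = OneClassSVM(nu=0.05, gamma="scale")
oc.fit(X[y == 1])

# predict() returns +1 for inliers (map to class 1) and -1 for outliers (class 0)
pred = np.where(oc.predict(X) == 1, 1, 0)
print(balanced_accuracy_score(y, pred))
```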

Good luck.