Scoring_Parameter

When we have imbalanced data, which scoring metric should we use in RandomizedSearchCV: ‘roc_auc’ or ‘f1’?

F1 is a somewhat arbitrary score that takes the harmonic mean of precision and recall. I would not recommend using it.

I would recommend three scores to consider for imbalanced classification problems:

  • ROC-AUC
  • Average precision (area under the precision-recall curve)
  • Matthews correlation coefficient

I would not say that one of these metrics is better than the others, because it depends on your application at hand: they differ in how they account for the negative class (for instance, the precision-recall curve ignores true negatives, while ROC-AUC and the Matthews correlation coefficient take them into account).
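Regarding the original question, here is a minimal sketch (the dataset, model, and parameter range are made up for illustration) of how these metrics can be passed to RandomizedSearchCV through its scoring parameter: ROC-AUC and average precision have built-in scorer names, and the Matthews correlation coefficient can be wrapped with make_scorer.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import RandomizedSearchCV

# Toy imbalanced dataset (roughly 10% positives).
X, y = make_classification(n_samples=2_000, weights=[0.9, 0.1], random_state=0)

# ROC-AUC and average precision are available as built-in scorer strings;
# the Matthews correlation coefficient is wrapped with make_scorer.
scoring = {
    "roc_auc": "roc_auc",
    "average_precision": "average_precision",
    "mcc": make_scorer(matthews_corrcoef),
}

search = RandomizedSearchCV(
    LogisticRegression(max_iter=1_000),
    param_distributions={"C": loguniform(1e-3, 1e3)},
    n_iter=10,
    scoring=scoring,
    refit="average_precision",  # metric used to select the refitted model
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

With multi-metric scoring, refit must name the metric used to pick the final model; alternatively, you can pass a single scorer string (e.g. scoring="average_precision") if you only want one.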

The Matthews correlation coefficient is computed from hard class predictions (the output of model.predict), while ROC-AUC and average precision summarize the ranking of continuous scores over all possible thresholds. Sometimes, someone is instead interested in the quality of the predicted probabilities of a classifier (i.e. model.predict_proba), and in this case the Brier score would be an interesting metric to look at.
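As a small illustration (with a made-up model and dataset), the Matthews correlation coefficient is computed from the thresholded predictions, whereas the Brier score is computed from the predicted probability of the positive class:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, matthews_corrcoef
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2_000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# MCC is computed from the hard class predictions...
print("MCC:", matthews_corrcoef(y_test, model.predict(X_test)))

# ...while the Brier score evaluates the predicted probability of the
# positive class directly (lower is better).
proba_pos = model.predict_proba(X_test)[:, 1]
print("Brier score:", brier_score_loss(y_test, proba_pos))
```

If you want to optimize for it in RandomizedSearchCV, the corresponding scorer string is ‘neg_brier_score’, since scikit-learn scorers follow a higher-is-better convention.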