Indeed, at the moment the decision threshold of probabilistic classifiers is not tunable in scikit-learn and is hard-coded to 0.5.
We might change that in a future version. In any case, you could change this in a subclass of your own:
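
    # BaseClassifier stands for any classifier with predict_proba (example choice here).
    from sklearn.linear_model import LogisticRegression as BaseClassifier

    class CustomThresholdClassifier(BaseClassifier):
        def __init__(self, proba_threshold=0.5, **other_params):
            super().__init__(**other_params)
            self.proba_threshold = proba_threshold

        def predict(self, X):
            # Predict classes_[1] when its probability exceeds the threshold.
            positive = super().predict_proba(X)[:, 1] > self.proba_threshold
            return self.classes_[positive.astype(int)]
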
Note that the code above only works for binary classification (and I haven't actually tested it).
You could also write a generic meta-estimator that works with any base classifier instance, passed in as a base_estimator
attribute, using wrapping/composition logic instead of subclassing.
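
A minimal sketch of such a meta-estimator could look like the following. The class name `ThresholdClassifier` and the fitted attribute `estimator_` are hypothetical names chosen for illustration, not an existing scikit-learn API, and like the subclass version it assumes binary classification and a base estimator that exposes `predict_proba`.

```python
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


class ThresholdClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: binary predictions from a custom probability threshold."""

    def __init__(self, base_estimator, proba_threshold=0.5):
        # Store parameters unchanged so get_params/set_params and clone keep working.
        self.base_estimator = base_estimator
        self.proba_threshold = proba_threshold

    def fit(self, X, y):
        # Fit a clone so the estimator instance passed in is left untouched.
        self.estimator_ = clone(self.base_estimator).fit(X, y)
        self.classes_ = self.estimator_.classes_
        if len(self.classes_) != 2:
            raise ValueError("This sketch only supports binary classification.")
        return self

    def predict_proba(self, X):
        # Probabilities are delegated to the wrapped classifier as-is.
        return self.estimator_.predict_proba(X)

    def predict(self, X):
        # Column 1 holds the probability of classes_[1], treated as the positive class.
        positive = self.predict_proba(X)[:, 1] > self.proba_threshold
        return self.classes_[positive.astype(int)]


# Usage on synthetic data: a higher threshold can only yield fewer positive predictions.
X, y = make_classification(n_samples=200, random_state=0)
strict = ThresholdClassifier(LogisticRegression(), proba_threshold=0.9).fit(X, y)
lenient = ThresholdClassifier(LogisticRegression(), proba_threshold=0.1).fit(X, y)
print(strict.predict(X).sum(), lenient.predict(X).sum())
```

Because the threshold is an ordinary constructor parameter here, it should be possible to search over it with the usual model selection tools, for example under the parameter name `proba_threshold` in a grid search.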
See also this pull request in scikit-learn, where we discussed how to implement generic tools for tuning this: https://github.com/scikit-learn/scikit-learn/pull/10117