Mean_absolute_percentage_error and TransformedTargetRegressor

Hi,

At the end of this notebook I computed the mean_absolute_percentage_error with the predicted values obtained with the modified model. It decreases from 13.58% with the original model to 9.92% with the transformed model.
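For reference, a minimal sketch of how I computed the metric (toy arrays of my own, not the notebook's data; note that sklearn returns a fraction, not a percentage):

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

# MAPE = mean(|y_true - y_pred| / |y_true|), returned as a fraction
mape = mean_absolute_percentage_error(y_true, y_pred)  # (0.1 + 0.05 + 0.1) / 3
print(f"{mape:.2%}")  # prints 8.33%
```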

Then I made the same comparison with HistGradientBoostingRegressor and the value decreased from 9.77% with the original model to 9.57% with the modified model (respectively 9.88% and 9.41% with sample_weight = 1 / target_train).

Is there some theoretical reason for this trend (i.e. reduction of mean_absolute_percentage_error) when transforming the target data ?
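One intuition I considered (just a hedged sketch, assuming the transformer compresses large values roughly like a log does): absolute errors in the transformed space then approximate relative errors in the original space, which is exactly what MAPE measures.

```python
import numpy as np

# |log(pred) - log(true)| = |log(pred / true)| ~= |pred / true - 1|
# for predictions close to the true value.
true, pred = 100.0, 105.0
log_error = abs(np.log(pred) - np.log(true))  # ~0.0488
relative_error = abs(pred - true) / true      # 0.05
print(log_error, relative_error)
```

So a model minimizing errors on a log-like scale is implicitly closer to optimizing relative error, but I am not sure this fully explains the trend.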

I understand this is probably a question beyond the scope of this mooc but it is not usual for me to get access to such a skilled team :slight_smile:

thanks

If you cross-validate your results:

from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_validate

hgbd = HistGradientBoostingRegressor()
model_transformed_target = TransformedTargetRegressor(
    regressor=hgbd, transformer=transformer)
cv_results = cross_validate(
    model_transformed_target, data, target,
    scoring='neg_mean_absolute_percentage_error')

The output is:

hgbd without transform: 10.02 +/- 0.49
hgbd with transform:     9.71 +/- 0.64

you will notice that the models with and without the QuantileTransformer have equivalent generalization performance (the scores overlap a lot). The reason is that the HistGradientBoostingRegressor already does an inner quantile binning, with respect to which an extra quantile transformation is idempotent, although small variations may arise when sorting the data.

Actually HistGradientBoostingRegressor does quantile binning only on the input features, not on the target.

However, indeed I don’t think that a GBRT model would be very sensitive to a monotonic transformation of the target, as decision trees are typically flexible enough to approximate functions of any shape, as long as the number of trees and their depth are large enough to avoid under-fitting.

Many thanks to both of you for this discussion. I ran the test with a GradientBoostingRegressor and you are right: transforming the target has no real influence.