Polynomial Feature Transformer with classification

Would it be a good idea to use the PolynomialFeatures transformer to augment the feature space for general-purpose modelling (e.g. classification), or is it something that should only be applied in regression, and only in conjunction with linear models?

Additionally, in terms of preprocessing, should any preprocessing (e.g. normalization) take place prior to the transformer or after?

Thanks

Would it be a good idea to use the PolynomialFeatures transformer to augment the feature space for general-purpose modelling (e.g. classification), or is it something that should only be applied in regression, and only in conjunction with linear models?

It can be used for general purposes as long as features are expected to interact. For instance, think of a classification model to decide whether a patient is at risk of developing heart disease. This could depend on the Body Mass Index, which is defined as weight / height² (assuming one of the features is 1 / height², so that the degree-2 product of the two features is the BMI). BMI is then an informative feature regardless of whether the model is linear or tree-based.
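A minimal sketch of this idea, with made-up data and a made-up label rule (the feature values, threshold, and model choice here are illustrative assumptions, not part of the original post):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
weight = rng.normal(75, 10, 200)                      # kg
inv_height_sq = 1 / rng.normal(1.7, 0.1, 200) ** 2    # 1 / m²
X = np.column_stack([weight, inv_height_sq])
# Hypothetical label driven by the interaction term
# weight * (1 / height²), i.e. the BMI.
y = (weight * inv_height_sq > 26).astype(int)

# degree=2 adds squares and the pairwise product of the raw features;
# the product of these two particular features is exactly the BMI,
# which the downstream linear model can then use directly.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LogisticRegression(max_iter=1000),
)
model.fit(X, y)
```

With `include_bias=False` and two input features, degree 2 produces five columns: the two originals, their squares, and the cross term.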

Additionally, in terms of preprocessing, should any preprocessing (e.g. normalization) take place prior to the transformer or after?

From a computational point of view, I would say it’s better to first create the interaction features and then normalize them. That way you prevent the high-order terms (x² and above) from spreading the data over very different scales and potentially harming convergence.

From an interpretability point of view, it feels more natural to create the interactions in their original units and then scale them. However, scaling first and then transforming can give more weight to features that would otherwise be overlooked at their original scale. This is good for the model if those features are truly predictive, as mentioned in the solution of Exercise M3.02.
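The two orderings can be compared side by side. A small sketch, assuming hypothetical data with large raw magnitudes (the numbers are arbitrary, chosen only to make the squared terms blow up):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(1000, 50, size=(100, 2))  # raw values around 1000

# Transform first, then scale: the squared columns (order ~1e6) are
# brought back to zero mean and unit variance before the estimator
# sees them, which helps gradient-based solvers converge.
poly_then_scale = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
)
Xt = poly_then_scale.fit_transform(X)

# Scale first, then transform: interactions are built from
# standardized values, so their original units are lost, but no
# column reaches extreme magnitudes either.
scale_then_poly = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
)
Xs = scale_then_poly.fit_transform(X)
```

After `poly_then_scale`, every column (squares included) has unit variance; after `scale_then_poly`, all values stay within a few units but the cross terms are products of z-scores rather than of the original quantities.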
