Categorical Features

SandraOriji · 19 April 2022 21:31

For categorical features, it is common to omit scaling when the features are encoded with a OneHotEncoder, since the encoded values are already on a similar scale.
Does this mean that once we encode the categorical features with OneHotEncoder, there is no need to scale the categorical columns afterwards?

I have been wondering how that is possible, because after encoding I have to work with the whole dataset (categorical and numerical features) to build my model, and I still need to scale my numerical columns.

How can I keep the categorical features from being scaled, since I am using all of my data?

glemaitre58 · 20 April 2022 09:24

Yes, you don’t need to apply an additional scaling after encoding.

You can check the next notebook that will show how to use a ColumnTransformer to apply different preprocessing on different columns of the original data.
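
As a rough illustration of the idea, here is a minimal sketch of a ColumnTransformer that scales only the numerical columns and one-hot encodes only the categorical one. The toy data and the column names (age, hours_per_week, workclass) are made up for this example and are not taken from the course notebook; sparse_threshold=0 is set only so that the result is a dense array that is easy to inspect.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numerical columns and one categorical column.
data = pd.DataFrame(
    {
        "age": [25, 38, 52, 41],
        "hours_per_week": [40, 50, 20, 35],
        "workclass": ["Private", "State-gov", "Private", "Self-emp"],
    }
)

numerical_columns = ["age", "hours_per_week"]
categorical_columns = ["workclass"]

# Each transformer only sees the columns listed for it:
# StandardScaler never touches the one-hot encoded columns.
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), numerical_columns),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_columns),
    ],
    sparse_threshold=0,  # always return a dense array (for easy inspection)
)

transformed = preprocessor.fit_transform(data)

# The first two output columns are the scaled numerical features,
# the remaining columns are the 0/1 indicators from the encoder.
print(transformed.shape)
print(transformed)
```

The preprocessor can then be chained with a model, for instance with make_pipeline(preprocessor, LogisticRegression()), so that the whole dataset is passed in at once while each group of columns gets its own preprocessing.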

glemaitre58 · 20 April 2022 09:25

Actually, I see that you are in Module 4. The ColumnTransformer was presented in Module 1: Using numerical and categorical variables together — Scikit-learn course

SandraOriji · 21 April 2022 10:24

This documentation is so clear. Thank you so much, @glemaitre58!