Creating a new df for categorical variables

Hello there,
For Q6, on encoding all the categorical variables, I have two sub-questions:

  1. Is it possible to filter ordinal and nominal categories? Or do I need to check each column manually and then perform the encoding accordingly?
  2. Can we use one hot encoding on several columns at the same time?

Thanks!

Hello @agarwalamit081.

Regarding your first question, the answer is yes: you can use a pipeline to apply different preprocessing to different columns. Note that you would still need some expert knowledge to decide which encoding is meaningful or necessary in a given real-life application.
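As a minimal sketch of what this could look like, the snippet below applies an ordinal encoding to one column and a one-hot encoding to another. The data frame, the column names (`size`, `color`) and the category order are made up for illustration and are not taken from the Q6 dataset; the split into ordinal and nominal columns is written by hand, since scikit-learn cannot infer it from the data.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical toy data; these columns are illustrative only.
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],  # ordinal: has a natural order
    "color": ["red", "blue", "red", "green"],       # nominal: no natural order
})

# The ordinal/nominal split is decided manually, using knowledge of the data.
ordinal_columns = ["size"]
nominal_columns = ["color"]

preprocessor = make_column_transformer(
    # The order of the categories is stated explicitly (an assumption here).
    (OrdinalEncoder(categories=[["small", "medium", "large"]]), ordinal_columns),
    (OneHotEncoder(handle_unknown="ignore"), nominal_columns),
    sparse_threshold=0,  # always return a dense NumPy array
)

# One integer-coded column for "size", then one indicator column per color.
encoded = preprocessor.fit_transform(df)
```

Passing the category order explicitly matters: without it, `OrdinalEncoder` would sort the categories alphabetically, which rarely matches the intended order.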

Regarding the second question, the answer is also yes. Indeed, this is what Q6 is asking for. Remember that make_column_transformer will come in handy to define which columns require which preprocessing.
