Hi,
In Question 6, I think we should only consider the numerical features in the list named “numerical_features” given before Question 5.
Is there a way to use the make_column_selector function and obtain only the subset of columns named in a specific list?
What I ended up doing was to drop the other numeric columns from the dataset, so that when I used make_column_selector like this:
numerical_columns_selector = selector(dtype_exclude=object)
numerical_columns = numerical_columns_selector(data)
I only obtained the ones I didn’t drop, but I guess there should be a better way to do it.
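To make the workaround concrete, here is a minimal sketch of what I mean on a toy DataFrame. The column names and values are made up for illustration (they are not the ones from the exercise), and I am assuming numerical_features is a plain list of column names:

```python
import pandas as pd
from sklearn.compose import make_column_selector as selector

# Toy data: hypothetical columns, not the exercise dataset.
data = pd.DataFrame({
    "area": [50.0, 80.0, 120.0],
    "rooms": [2, 3, 5],
    "year_built": [1990, 2005, 2015],
    "neighborhood": pd.Series(["a", "b", "a"], dtype=object),
})
# Assumed stand-in for the list given before Question 5.
numerical_features = ["area", "rooms"]

# Find every non-object column, then drop the ones not in the list.
all_numeric = selector(dtype_exclude=object)(data)
to_drop = [col for col in all_numeric if col not in numerical_features]
reduced = data.drop(columns=to_drop)

# The selector now only sees the numeric columns that were kept.
numerical_columns_selector = selector(dtype_exclude=object)
numerical_columns = numerical_columns_selector(reduced)
print(numerical_columns)
```

This works, but it changes the dataset just to steer the selector, which is why I suspect there is a cleaner option.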
Following a Stack Overflow post (python - How to select only few columns in scikit learn column selector pipeline? - Stack Overflow), I tried this, but it didn’t work:
preprocessor = ColumnTransformer([('one-hot-encoder', categorical_preprocessor, categorical_columns), ('standard_scaler', "passthrough", numerical_features)])
Thanks in advance!