What is the difference between using categorical_columns_selector and categorical_columns in ColumnTransformer? Prior to this notebook, we did not use the selector inside ColumnTransformer. Please explain.
categorical_columns was a list of the names of the categorical columns to consider.
categorical_columns_selector is one level of abstraction higher: you only define a rule, based on the data type, for which columns to consider.
from sklearn.compose import make_column_selector
categorical_columns_selector = make_column_selector(
    dtype_include="object"
)
The line above defines a rule that selects the columns with "object" dtype. To get the actual list of selected column names, you need to call the selector with a dataframe:
categorical_columns = categorical_columns_selector(X)
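As a minimal sketch of what that call returns, here is the selector applied to a small dataframe. The dataframe and its column names are made up for illustration; they are not the notebook's dataset.

```python
import pandas as pd
from sklearn.compose import make_column_selector

# Hypothetical toy dataframe, for illustration only.
# The string columns are cast to object dtype explicitly so the example
# does not depend on how a given pandas version infers string dtypes.
X = pd.DataFrame(
    {
        "age": [25, 40, 31],
        "workclass": ["Private", "State-gov", "Private"],
        "education": ["HS-grad", "Bachelors", "Masters"],
    }
).astype({"workclass": object, "education": object})

# The selector is only a rule; it holds no column names yet.
categorical_columns_selector = make_column_selector(dtype_include=object)

# Calling the rule on a dataframe returns a plain list of column names.
categorical_columns = categorical_columns_selector(X)
print(categorical_columns)  # ['workclass', 'education']
```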
ColumnTransformer accepts either a list of column names or a callable that returns such a list. categorical_columns corresponds directly to the list, while categorical_columns_selector corresponds to a callable that returns the list when it is passed the dataframe.
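To make the equivalence concrete, here is a small sketch that builds one ColumnTransformer from the list and one from the callable, then compares their outputs. The toy dataframe and the choice of OrdinalEncoder are assumptions for illustration, not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy dataframe, for illustration only.
X = pd.DataFrame(
    {
        "age": [25, 40, 31],
        "workclass": ["Private", "State-gov", "Private"],
        "education": ["HS-grad", "Bachelors", "Masters"],
    }
).astype({"workclass": object, "education": object})

categorical_columns_selector = make_column_selector(dtype_include=object)
categorical_columns = categorical_columns_selector(X)

# Option 1: pass the explicit list of column names.
ct_list = ColumnTransformer(
    [("cat", OrdinalEncoder(), categorical_columns)]
)

# Option 2: pass the callable; ColumnTransformer calls it on the
# dataframe given to fit and uses the list it returns.
ct_callable = ColumnTransformer(
    [("cat", OrdinalEncoder(), categorical_columns_selector)]
)

out_list = ct_list.fit_transform(X)
out_callable = ct_callable.fit_transform(X)

# Both transformers encode the same two columns in the same way.
same = np.array_equal(out_list, out_callable)
print(same)
```

One practical difference: the list is fixed when you write it, whereas the callable is evaluated on whatever dataframe is passed to fit, so it adapts if the set of "object" columns changes.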
Got it, so ColumnTransformer can accept either categorical_columns_selector or categorical_columns.
Right?
Yes, that is it.