What is the difference between using categorical_columns_selector and categorical_columns in ColumnTransformer? Prior to this notebook, the selector was used inside ColumnTransformer. Please explain.
categorical_columns was a list of names of the categorical columns to consider.
categorical_columns_selector is one level of abstraction higher: you only define a rule, based on the data type, for which columns to consider.
from sklearn.compose import make_column_selector

categorical_columns_selector = make_column_selector(
    dtype_include="object"
)
The line above defines a rule that selects the columns of "object" dtype. To actually get the list of selected column names, you need to call the selector with a dataframe:
categorical_columns = categorical_columns_selector(X)
ColumnTransformer accepts either a list of values or a callable that returns a list of values. categorical_columns corresponds directly to the list of values, while categorical_columns_selector corresponds to a callable that returns the list of values when it is passed the dataframe.
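To make this concrete, here is a minimal sketch showing both forms passed to ColumnTransformer. The toy dataframe, the choice of OneHotEncoder, and the names preprocessor_from_list and preprocessor_from_selector are assumptions made for illustration; they are not taken from the notebook.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data (not from the notebook). The string columns are
# created with an explicit "object" dtype so the selector rule matches them.
X = pd.DataFrame({
    "workclass": pd.Series(["Private", "State-gov", "Private"], dtype="object"),
    "education": pd.Series(["HS-grad", "Bachelors", "Masters"], dtype="object"),
    "age": [25, 38, 52],
})

# The selector is only a rule; calling it on a dataframe yields column names.
categorical_columns_selector = make_column_selector(dtype_include="object")
categorical_columns = categorical_columns_selector(X)
print(categorical_columns)

# Option 1: pass the explicit list of column names.
preprocessor_from_list = ColumnTransformer(
    [("categorical", OneHotEncoder(handle_unknown="ignore"), categorical_columns)]
)

# Option 2: pass the callable; ColumnTransformer calls it on X during fit.
preprocessor_from_selector = ColumnTransformer(
    [("categorical", OneHotEncoder(handle_unknown="ignore"), categorical_columns_selector)]
)

encoded_from_list = preprocessor_from_list.fit_transform(X)
encoded_from_selector = preprocessor_from_selector.fit_transform(X)
print(encoded_from_list.shape, encoded_from_selector.shape)
```

In this sketch both preprocessors encode the same two columns. The practical difference is when the column names are resolved: the list is fixed when you build it, whereas the selector is evaluated on whatever dataframe is passed to fit.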
Got it, so ColumnTransformer can accept either categorical_columns_selector or categorical_columns, right?
Yes, that is it.