In the last question, you don't give much detail about categorical data preparation. I think you should be more explicit. I used a different strategy for categorical data imputation and got different results.
Sorry if I’m wrong.
Thank you!
I have the same problem. I used OneHotEncoder(handle_unknown="ignore") and SimpleImputer with default settings. As a result, my test_score is different. Is this my fault?
You are right. We should be more explicit about the categorical pipeline and mention that you should use SimpleImputer(strategy="constant", fill_value="missing").
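For reference, here is a minimal sketch of what such a categorical pipeline could look like. The toy array `X_toy` and its values are made up for illustration and are not the exercise's dataset; only the `SimpleImputer` and `OneHotEncoder` settings come from this thread.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Categorical pipeline with the imputation settings mentioned above.
categorical_processor = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(handle_unknown="ignore"),
)

# Hypothetical toy column with one missing value (not the exercise data).
X_toy = np.array([["red"], [np.nan], ["blue"], ["red"]], dtype=object)

# OneHotEncoder returns a sparse matrix by default, so convert it to dense.
encoded = categorical_processor.fit_transform(X_toy).toarray()

# Categories learned by the encoder: the missing value has its own category.
categories = categorical_processor.named_steps["onehotencoder"].categories_[0].tolist()
print(categories)
print(encoded)
```

With this strategy, missing values are replaced by the string "missing", which the encoder then treats as a category of its own.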
Got the same error because I used:
categorical_processor = make_pipeline(
    SimpleImputer(strategy="most_frequent"),
    OneHotEncoder(handle_unknown="ignore"),
)
You could be more explicit. Could you also tell me why strategy="most_frequent" is not good?
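As a rough illustration of why the two imputation strategies can lead to different scores, the sketch below encodes the same column with each of them. The toy data and the helper `encode_with` are hypothetical, not part of the exercise. It does not show that "most_frequent" is wrong in general, only that the model receives different features.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column with one missing value (not the exercise data).
X_toy = np.array([["red"], [np.nan], ["blue"], ["red"]], dtype=object)


def encode_with(imputer):
    """Impute then one-hot encode X_toy; return the dense matrix and categories."""
    pipeline = make_pipeline(imputer, OneHotEncoder(handle_unknown="ignore"))
    dense = pipeline.fit_transform(X_toy).toarray()
    cats = pipeline.named_steps["onehotencoder"].categories_[0].tolist()
    return dense, cats


# Missing values become their own "missing" category.
constant_encoded, constant_cats = encode_with(
    SimpleImputer(strategy="constant", fill_value="missing")
)

# Missing values are replaced by the most frequent category ("red" here).
frequent_encoded, frequent_cats = encode_with(
    SimpleImputer(strategy="most_frequent")
)

print(constant_cats, constant_encoded.shape)
print(frequent_cats, frequent_encoded.shape)
```

With "most_frequent", the row with the missing value becomes indistinguishable from a "red" row, whereas with the constant strategy it keeps its own column. Since the encoded features differ, a different test_score is to be expected.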