Hi Teachers,
I have a few questions about Pipeline/make_pipeline. I understood your explanations, but in the examples there was only one step for each type of variable, such as StandardScaler for the numerical variables and an encoder for the categorical ones. What if I need more than one step, for example an imputer followed by a standard scaler for a numerical variable?
Using Pipeline seems more explicit for creating the steps; I tried with make_pipeline without success.
Another related question: is it possible to include a GridSearchCV inside the pipeline?
You can chain as many steps as you need with make_pipeline. Here, you specify to first apply the scaler, then the imputer, and finally train/predict with a logistic regression.
The difference between Pipeline and make_pipeline is that Pipeline lets you choose the name of each element in the pipeline, while make_pipeline assigns the names automatically (it uses the lowercased name of the class). I used make_pipeline above. The equivalent with Pipeline is:
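A minimal sketch of the two forms side by side, following the scaler, imputer, logistic regression order described above. The step names ("scaler", "imputer", "classifier") and the toy data are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# make_pipeline: step names are generated from the lowercased class names.
model_auto = make_pipeline(StandardScaler(), SimpleImputer(), LogisticRegression())

# Pipeline: the same steps, but each name is chosen explicitly (names are illustrative).
model_named = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("imputer", SimpleImputer()),
        ("classifier", LogisticRegression()),
    ]
)

auto_names = [name for name, _ in model_auto.steps]
named_names = [name for name, _ in model_named.steps]

# Toy numerical data with missing values (made up for this sketch).
X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, np.nan], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])

model_named.fit(X, y)
predictions = model_named.predict(X)
```

Both objects behave the same way; the explicit names mainly matter when you refer to a step later, for instance in a parameter grid such as `classifier__C`.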
Hi glemaitre58,
Thanks for the reply. I understood your explanation, but one question remains: can the steps you described be applied to both numerical and categorical processing? In some exercises we split the “steps” by type of variable. Can I apply the same idea there, in the way you explained?
Thanks again for your time. I’ve been enjoying this MOOC very much, and most of the explanations are clear and straightforward. Is it too much to ask for an example of make_pipeline with several steps for numerical and categorical features?
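A minimal sketch of what such a model could look like, assuming one multi-step pipeline per type of variable combined with a ColumnTransformer. The column names, the toy data and the choice of imputation strategies are assumptions made up for illustration, not the setup used in the MOOC exercises.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy dataset with missing values in both kinds of columns (illustrative only).
data = pd.DataFrame(
    {
        "age": [25.0, np.nan, 47.0, 51.0, 38.0, 29.0],
        "hours": [40.0, 35.0, np.nan, 50.0, 45.0, 20.0],
        "workclass": ["private", "public", np.nan, "private", "public", "private"],
    }
)
target = np.array([0, 0, 1, 1, 1, 0])

# Numerical columns: impute, then scale.
numeric_preprocessor = make_pipeline(
    SimpleImputer(strategy="median"), StandardScaler()
)
# Categorical columns: impute, then one-hot encode.
categorical_preprocessor = make_pipeline(
    SimpleImputer(strategy="most_frequent"), OneHotEncoder(handle_unknown="ignore")
)

# Send each group of columns to its own multi-step pipeline.
preprocessor = ColumnTransformer(
    [
        ("numeric", numeric_preprocessor, ["age", "hours"]),
        ("categorical", categorical_preprocessor, ["workclass"]),
    ]
)

# The full model is itself a pipeline: preprocessing, then the classifier.
model = make_pipeline(preprocessor, LogisticRegression())
model.fit(data, target)
predictions = model.predict(data)
step_names = list(model.named_steps)
```

Each entry of the ColumnTransformer can be a full pipeline, so there is no limit of one step per type of variable.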
Thanks glemaitre58,
After reading your reply carefully, I understood my previous mistake. Sorry for the English in my questions about nested cross-validation: I was referring to the inner and outer steps, but you already answered my questions. I should choose the inner and outer “process” with caution, in the same way you explained to us in this module.
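For readers following the thread, a minimal sketch of the inner/outer idea, assuming the common pattern where GridSearchCV wraps the pipeline (inner loop, hyperparameter tuning) and cross_validate wraps the search (outer loop, evaluation). The synthetic data, the grid of C values and the number of folds are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipeline = make_pipeline(StandardScaler(), LogisticRegression())

# Parameters of a step are addressed as "<step name>__<parameter>".
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# Inner loop: the grid search tunes C using 3-fold cross-validation.
inner_search = GridSearchCV(pipeline, param_grid=param_grid, cv=3)

# Outer loop: 5-fold cross-validation evaluates the whole tuning procedure.
outer_results = cross_validate(inner_search, X, y, cv=5)
test_scores = outer_results["test_score"]
```

With this arrangement the grid search is refit on each outer training fold, so the outer test scores are not used to pick the hyperparameters.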