Maybe an example of a nested pipeline, explained beforehand, would help to solve the last quiz question

andradgu · 28 April 2021 20:38

I spent a lot of time trying to understand how to chain all the transformers in a ColumnTransformer, but it finally becomes easy once you understand that a pipeline can be used inside a ColumnTransformer, as a nested pipeline. Maybe this notion could make it easier to answer the last quiz question. Of course, there may be other simple ways to do this, but I haven't found any.

lesteve · 29 April 2021 13:59

Just a side comment (not sure I fully understood your post): you can compare with the solution of the wrap-up quiz, which should give you some code to answer your question.

If you think we can improve something in the answer or in the previous notebook, let us know!

andradgu · 2 May 2021 21:13

This is exactly the solution I found. This solution involves:

a pipeline inside a ColumnTransformer
a ColumnTransformer inside another pipeline

so in effect a pipeline inside another pipeline: nested pipelines
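
For illustration, here is a minimal sketch of that nesting with scikit-learn. The toy data, the column names ("age", "city") and the choice of imputers, encoders and classifier are all assumptions made up for this example; this is not the quiz's actual solution.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with missing values in both kinds of columns.
X = pd.DataFrame({
    "age": [25.0, np.nan, 47.0, 51.0, 38.0, 29.0],
    "city": ["Paris", "Lyon", np.nan, "Paris", "Lyon", "Paris"],
})
y = np.array([0, 1, 0, 1, 1, 0])

# Inner pipelines: one per kind of column (impute first, then transform).
numeric_pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
categorical_pipe = make_pipeline(
    SimpleImputer(strategy="most_frequent"),
    OneHotEncoder(handle_unknown="ignore"),
)

# The ColumnTransformer dispatches each inner pipeline to its columns.
preprocessor = ColumnTransformer([
    ("num", numeric_pipe, ["age"]),
    ("cat", categorical_pipe, ["city"]),
])

# Outer pipeline: preprocessing followed by the predictive model.
model = make_pipeline(preprocessor, LogisticRegression())
model.fit(X, y)
predictions = model.predict(X)
```

The outer pipeline can then be passed as a single estimator to tools such as cross_validate, so the imputation and scaling are refitted on each training fold.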

This is an important concept, but I think (though I may be wrong) that it is not explicitly mentioned in the course.

lesteve · 3 May 2021 08:29

I see what you mean. Hmm, in my opinion this part of the wrap-up quiz is a bit too hard, especially since this is the first module:

it contains imputation, and we never talked about imputation before
it uses pipelines for preprocessing only, and we did not mention that explicitly

I would be in favour of replacing the NaNs by hand with some pandas (or maybe scikit-learn if there is an easy way to do it) and having a simpler pipeline.
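
As a rough sketch of what "by hand with some pandas" could look like: the data and column names ("area", "rooms") below are hypothetical, and filling with the column mean is just one possible choice. Note that computing the mean on the full dataset before a train/test split lets information from the test samples leak into training, which is one reason to prefer an imputer inside the pipeline in real use.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric data with a few missing values.
X = pd.DataFrame({
    "area": [50.0, np.nan, 80.0, 120.0],
    "rooms": [2.0, 3.0, np.nan, 5.0],
})
y = np.array([150.0, 200.0, 240.0, 360.0])

# Replace each NaN with the mean of its column, using pandas only.
X_filled = X.fillna(X.mean())

# With no missing values left, the pipeline needs no imputation step.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_filled, y)
predictions = model.predict(X_filled)
```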

lesteve · 3 May 2021 08:56

Or, to make the improvement quicker, maybe a hint is good enough, not sure…

ogrisel · 3 May 2021 10:01

A hint, or even make it explicitly part of the instructions to use one pipeline for each kind of column.

lesteve · 4 May 2021 08:50

Tracked in https://github.com/INRIA/scikit-learn-mooc/issues/301

lfarhi · 10 May 2021 16:17