Coefficients in pipeline

FabioCLima · 13 July 2021 14:43

About the coefficients of the linear regression in the pipeline: I found mine using this kind of inspection.

I was able to answer the question, but then I checked the graded solution. I know it is a list comprehension, but I was not able to translate the idea. Do you mind explaining it to me?

ogrisel · 14 July 2021 10:15

Your solution is correct but just displays the coefficients for the model trained on the first iteration of cross-validation.

Cross-validation typically trains 5 to 10 (or sometimes more) models on various random splits of the data. So for each model you get a different set of coefficients.

The list comprehension therefore returns a list of 5 numpy arrays instead of one: you can check by executing just this line of the solution in a notebook cell and displaying the value of the coefs variable without converting it to a pandas dataframe.
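To make this concrete, here is a minimal sketch of what the list comprehension does, next to the same logic written as a plain loop. It is not the graded solution itself: the synthetic data from make_regression, the StandardScaler step and cv=5 are assumptions chosen only so that the snippet runs on its own.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the notebook's dataset (assumption).
X, y = make_regression(n_samples=200, n_features=3, noise=1.0, random_state=0)

# Hypothetical pipeline: the preprocessing step is an assumption.
model = make_pipeline(StandardScaler(), LinearRegression())

# return_estimator=True keeps the fitted pipeline of each CV iteration.
cv_results = cross_validate(model, X, y, cv=5, return_estimator=True)

# List comprehension: one coefficient array per fitted pipeline.
coefs = [estimator[-1].coef_ for estimator in cv_results["estimator"]]

# The same thing written as an explicit loop.
coefs_loop = []
for estimator in cv_results["estimator"]:
    coefs_loop.append(estimator[-1].coef_)

print(len(coefs))      # number of fitted models
print(coefs[0].shape)  # one coefficient per feature
```

Taking only cv_results["estimator"][0][-1].coef_ would give the coefficients of the first model alone, which is what the inspection in the question displays.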

Maybe also: estimator[-1].coef_ inside the list comprehension is equivalent to writing estimator["linearregression"].coef_ since estimator is a pipeline in this notebook. estimator[-1] is a way to access the last step of the pipeline, which is the LinearRegression instance in this case.
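A small sketch of the two ways of indexing, assuming a pipeline built with make_pipeline (which names each step after its lowercased class name). The StandardScaler step and the synthetic data are assumptions for illustration only:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, only here so that the pipeline can be fitted (assumption).
X, y = make_regression(n_samples=100, n_features=3, random_state=0)
pipeline = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# Names generated by make_pipeline for each step.
step_names = [name for name, _ in pipeline.steps]
print(step_names)

# Access the last step by position or by name.
last_by_position = pipeline[-1]
last_by_name = pipeline["linearregression"]

# Both expressions return the very same fitted LinearRegression object.
same_object = last_by_position is last_by_name
print(same_object)
```

Indexing by position keeps working if the last step is swapped for another estimator, while indexing by name is more explicit about which step is meant.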

ogrisel · 14 July 2021 10:17

I think this is an example of a solution where we use advanced Python and API constructs that were not necessarily well presented in the previous notebook.

We should probably rewrite the solution to use more straightforward code that does not introduce any novelty, so that it focuses on the newly introduced ML concepts rather than confusing them with the complexity of new programming aspects.

FabioCLima · 14 July 2021 11:10

Thank you so much. Just when I think I understand Python better, I get hit by the simplest idea (in this case by a slice index, which felt like a truck… ahahaha). I just described my own way of understanding the idea: in order to really understand it, I went through all the steps in the pipeline. Anyway, merci for your attention and congrats to the team for the great content. I'm feeling more confident that I already have the means to work with scikit-learn. Great content, great support, and you guys really push us to improve throughout the course.

Best regards,
Fabio Carvalho Lima (a Brazilian data scientist always looking to improve myself)

ogrisel · 14 July 2021 19:23

Thanks for the valuable feedback. If you struggled on this line, I am sure that at least 100 other people struggled on the same line and were too shy to report it in the forum.

In future versions of the exercises and wrap-up quizzes, we will try to be more careful not to use programming constructs that are not explained in the main notebooks.