cv_results["estimator"]

I can't understand how to retrieve the weights.
Here, the code is:

weights = pd.DataFrame(
    [est.coef_ for est in cv_results["estimator"]], columns=data.columns)

In another example from the course, it is:
coefs = [pipeline[-1].coef_[0] for pipeline in cv_results["estimator"]]
coefs = pd.DataFrame(coefs, columns=data.columns)

The two cases seemed identical to me.
Could you explain the structure of cv_results["estimator"] in more detail?

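In the first case, the estimator passed to cross_validate is the linear model itself, e.g. LinearRegression. cv_results["estimator"] then holds one fitted copy of that model per cross-validation fold (the key is only present when cross_validate is called with return_estimator=True).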

Each fitted LinearRegression instance therefore exposes the fitted attribute coef_ directly.
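
As a minimal sketch of this first case: the snippet below uses synthetic data from make_regression and made-up column names "a", "b", "c" in place of the course dataset, so those are assumptions, not the exercise's actual data. It only shows that each element of cv_results["estimator"] is a fitted LinearRegression whose coef_ can be read directly.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the course data (assumption: 3 numeric features).
X, y = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)
data = pd.DataFrame(X, columns=["a", "b", "c"])

# return_estimator=True is what adds the "estimator" key to cv_results.
cv_results = cross_validate(
    LinearRegression(), data, y, cv=5, return_estimator=True
)

# One fitted LinearRegression per fold; each has a 1D coef_ of length 3.
weights = pd.DataFrame(
    [est.coef_ for est in cv_results["estimator"]], columns=data.columns
)
print(weights)
```

With cv=5, weights has one row per fold and one column per feature.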

In the second case, the model is not a single LinearRegression but a Pipeline, e.g. make_pipeline(StandardScaler(), LinearRegression()). The Pipeline object itself has no coef_ attribute; only the linear model inside it does, so you need to access the fitted LinearRegression instance. pipeline[0] is the StandardScaler instance, while pipeline[-1] is the last step of the pipeline and therefore the LinearRegression.

Thus, to access the coef_ of the LinearRegression that is stored in a Pipeline, you need to write pipeline[-1].coef_.
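
The pipeline case can be sketched the same way, again on synthetic data with made-up column names (both assumptions). Each element of cv_results["estimator"] is now a fitted Pipeline, and the coefficients are read from its last step.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the course data (assumption: 3 numeric features).
X, y = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)
data = pd.DataFrame(X, columns=["a", "b", "c"])

model = make_pipeline(StandardScaler(), LinearRegression())
cv_results = cross_validate(model, data, y, cv=5, return_estimator=True)

# Each element is a fitted Pipeline, not a bare linear model.
first = cv_results["estimator"][0]
scaler, regressor = first[0], first[-1]

# Read coef_ from the last step of every fitted pipeline.
coefs = pd.DataFrame(
    [pipeline[-1].coef_ for pipeline in cv_results["estimator"]],
    columns=data.columns,
)
print(coefs)
```

The extra [0] in the course snippet is not explained in this thread. One plausible reason, stated here as an assumption, is that the course example uses a classifier such as LogisticRegression on a binary target: its coef_ has shape (1, n_features), so coef_[0] selects the single row. With LinearRegression and a 1D target, coef_ is already 1D and no [0] is needed.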

This explains the slight difference between the two lines of code.


I think the use of a pipeline should’ve been included in this particular exercise. The idea that the pipeline is a “list” of transformers and an estimator is most clearly explained by pipeline[-1]. Awesome exercise!
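
For anyone who wants to poke at the "list" idea, here is a small sketch on an unfitted pipeline. The step names are the ones make_pipeline generates automatically (lowercased class names); nothing here comes from the exercise itself.

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

pipeline = make_pipeline(StandardScaler(), LinearRegression())

# steps is a plain list of (name, estimator) tuples.
step_names = [name for name, _ in pipeline.steps]

# Integer indexing returns the estimator at that position.
last_step = pipeline[-1]

# Slicing returns a new Pipeline holding the selected steps.
preprocessing = pipeline[:-1]

# The same step can also be looked up by its generated name.
same_last_step = pipeline.named_steps["linearregression"]

print(step_names, len(pipeline))
```

Indexing by position is convenient in a list comprehension over cv_results["estimator"], while named_steps is more readable when the pipeline has many steps.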