Order of pipeline

thatgeeman · 12 July 2021 14:19

Thus, you can create a model that will pipeline the scaler, followed by the imputer, followed by the linear regression.

Just before Question 1, the suggested order for the pipeline is given as scaler -> imputer -> regressor.
Shouldn’t the data be scaled after the NaNs are filled (imputer), i.e.
imputer -> scaler -> regressor? If not, why?

Thanks!

ogrisel · 12 July 2021 16:49

Ideally, the order should not matter too much. However, if I remember correctly, in this case it would slightly change the results and make answering the quiz problematic. Unfortunately, I don’t recall exactly why.

thatgeeman · 13 July 2021 07:10

Yes, exactly. I used the order imputer -> scaler -> regressor and the answers were a bit off, but I still managed to “guess” the right option.
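For anyone who wants to compare the two orderings directly, here is a minimal sketch. It is not the quiz setup: the data is synthetic, and it assumes `StandardScaler`, `SimpleImputer(strategy="mean")` and `LinearRegression` from scikit-learn, since the thread does not say which exact estimators or settings the exercise used.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration, not the course dataset)
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Replace roughly 20% of the entries with NaN
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.2] = np.nan

# Order suggested in the course text: scaler -> imputer -> regressor
scale_first = make_pipeline(
    StandardScaler(), SimpleImputer(strategy="mean"), LinearRegression()
)
# Order asked about in the question: imputer -> scaler -> regressor
impute_first = make_pipeline(
    SimpleImputer(strategy="mean"), StandardScaler(), LinearRegression()
)

scale_first.fit(X_missing, y)
impute_first.fit(X_missing, y)

# Coefficients of the final LinearRegression step in each pipeline
coef_scale_first = scale_first[-1].coef_
coef_impute_first = impute_first[-1].coef_

# Predictions of each pipeline on the same data
pred_scale_first = scale_first.predict(X_missing)
pred_impute_first = impute_first.predict(X_missing)

# Fraction of observed (non-NaN) values per column
observed_fraction = 1.0 - np.isnan(X_missing).mean(axis=0)

print("coef (scale first): ", coef_scale_first)
print("coef (impute first):", coef_impute_first)
print("same predictions:", np.allclose(pred_scale_first, pred_impute_first))
```

In this sketch the two pipelines give the same predictions but different coefficients. `StandardScaler` ignores NaNs when fitting, so scaling first computes the standard deviation from the observed values only. Imputing first adds values that sit exactly at the column mean, which shrinks the standard deviation and rescales each column by a constant factor. That would be consistent with quiz answers about the weights being "a bit off", though it may not be the whole story for the actual exercise, and it does not carry over to other imputation strategies such as the median.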