Order of pipeline

Thus, you can create a model that will pipeline the scaler, followed by the imputer, followed by the linear regression.

Just before Question 1, the suggested order for the pipeline is given as scaler -> imputer -> regressor.
Shouldn’t the data be scaled after the NaNs have been filled by the imputer, i.e. imputer -> scaler -> regressor?
If not, why is the scaler placed first?

Thanks!

Ideally, the order should not matter too much. However, if I remember correctly, in this case swapping it would slightly change the results and make answering the quiz problematic. Unfortunately, I don’t recall exactly why.


Yes, exactly. I used the order imputer -> scaler -> regressor, and the answers were a bit off, but I still managed to “guess” the right option.
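
For anyone who wants to check the effect of the ordering themselves, here is a minimal sketch that fits both orders on the same data and compares them. It is not the course notebook: the synthetic data, the mean imputation strategy, and the variable names are all assumptions made for illustration. The idea is that StandardScaler ignores NaNs when computing its statistics, whereas after imputation the filled-in values take part in the mean and standard deviation, so the scaled features (and hence the coefficients) can come out differently.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption, not the quiz dataset).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
# Knock out roughly 20% of the entries to mimic missing values.
X[rng.random(X.shape) < 0.2] = np.nan

# Order suggested in the course text: scaler -> imputer -> regressor.
scale_first = make_pipeline(
    StandardScaler(), SimpleImputer(strategy="mean"), LinearRegression()
)
# Order asked about in this thread: imputer -> scaler -> regressor.
impute_first = make_pipeline(
    SimpleImputer(strategy="mean"), StandardScaler(), LinearRegression()
)

scale_first.fit(X, y)
impute_first.fit(X, y)

# make_pipeline names each step after its lowercased class name.
coef_scale_first = scale_first.named_steps["linearregression"].coef_
coef_impute_first = impute_first.named_steps["linearregression"].coef_

# Largest difference between the two pipelines' predictions on X.
pred_gap = np.max(np.abs(scale_first.predict(X) - impute_first.predict(X)))

print("coef (scaler first): ", coef_scale_first)
print("coef (imputer first):", coef_impute_first)
print("max prediction gap:  ", pred_gap)
```

In this particular setup the coefficients differ between the two orders while the predictions agree up to floating-point error, so a quiz question about coefficient values could be sensitive to the order even if one about predictions is not. Whether that is what happens in the actual quiz depends on the imputer strategy and model it uses, which I have not checked.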