Error about learning_rate

In the lecture “Hyperparameter tuning” it is said:

Finally, we have overlooked the impact of the learning_rate parameter until now. When fitting the residuals, we would like the tree to try to correct all possible errors or only a fraction of them. The learning-rate allows you to control this behaviour. A small learning-rate value would only correct the residuals of very few samples. If a large learning-rate is set (e.g., 1), we would fit the residuals of all samples. So, with a very low learning-rate, we will need more estimators to correct the overall error. However, a too large learning-rate tends to obtain an overfitted ensemble, similar to having a too large tree depth.

This does not seem to match the documentation of sklearn.ensemble.GradientBoostingRegressor (scikit-learn 0.24.2 documentation):

Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators.

Both descriptions are indeed consistent. The “contribution of each tree” refers to the speed at which each tree corrects the residuals of a boosting iteration: shrinking that contribution by learning_rate means that each new tree only corrects a fraction of the remaining error, which is why a smaller learning_rate needs more estimators (the trade-off mentioned in the documentation).
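
To make the link concrete, here is a minimal sketch of a boosting loop with shrinkage. It is not scikit-learn's actual implementation (which works on gradients of the chosen loss), and the dataset, tree depth, and learning rate values are arbitrary choices for illustration, but it shows why scaling each tree's contribution by learning_rate is the same thing as correcting only a fraction of the residuals at each iteration:

```python
# Minimal sketch of gradient boosting with shrinkage (illustrative only).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

learning_rate = 0.1  # small value: each tree only corrects 10% of the residuals
n_estimators = 100   # hence more trees are needed to reduce the overall error

prediction = np.full(y.shape, y.mean())  # start from a constant prediction
for _ in range(n_estimators):
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residuals)
    # the "contribution of each tree" is shrunk by learning_rate, i.e. only a
    # fraction of the residuals is corrected at this boosting iteration
    prediction += learning_rate * tree.predict(X)

print(f"final training MSE: {np.mean((y - prediction) ** 2):.2f}")
```

The same trade-off is exposed by GradientBoostingRegressor through its learning_rate and n_estimators parameters: with a smaller learning_rate, more estimators are typically needed to reach a comparable error.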
