In the lecture “Hyperparameter tuning” it is said:
Finally, we have overlooked the impact of the learning_rate parameter until now. When fitting the residuals, we would like the tree to try to correct all possible errors or only a fraction of them. The learning-rate allows you to control this behaviour. A small learning-rate value would only correct the residuals of very few samples. If a large learning-rate is set (e.g., 1), we would fit the residuals of all samples. So, with a very low learning-rate, we will need more estimators to correct the overall error. However, a too large learning-rate tends to obtain an overfitted ensemble, similar to having a too large tree depth.
This does not seem to match the documentation of sklearn.ensemble.GradientBoostingRegressor (scikit-learn 0.24.2), which says:
Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators.
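To make the documented trade-off concrete, here is a minimal sketch (dataset and parameter values are my own choices, not from the lecture) comparing a few learning_rate / n_estimators combinations. The idea is that since each tree's contribution is shrunk by learning_rate, a small learning rate needs more trees to reach a comparable fit:

```python
# Sketch of the learning_rate / n_estimators trade-off in gradient boosting:
# each tree's contribution is multiplied by learning_rate, so a smaller
# learning rate requires more estimators to correct the overall error.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression problem (chosen for illustration only)
X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for lr, n in [(1.0, 10), (0.1, 10), (0.1, 100)]:
    model = GradientBoostingRegressor(
        learning_rate=lr, n_estimators=n, random_state=0
    )
    model.fit(X_train, y_train)
    print(f"learning_rate={lr}, n_estimators={n}: "
          f"test R^2 = {model.score(X_test, y_test):.3f}")
```

With learning_rate=0.1 and only 10 trees the ensemble underfits, while the same learning rate with 100 trees recovers a much better test score, which matches the documentation's description of shrinkage rather than the lecture's "residuals of very few samples" phrasing.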