Trees number?

efenaux · 1 April 2022 12:38

Hi,

The text mentions: “You can set the trees number to be large”.
I guess we should read max_depth?

ArturoAmorQ · 1 April 2022 15:54

The number of trees is controlled by max_iter. I know the link is not obvious, but remember that the number of iterations of a boosting process corresponds to the number of trees. You can get more info in the scikit-learn documentation.

efenaux · 1 April 2022 19:53

Thanks, I understood that from reading the doc and the end of the exercise.
Is there a reason why the parameter is called max_iter in HistGradientBoostingRegressor and n_estimators in GradientBoostingRegressor?

Thanks

glemaitre58 · 3 April 2022 09:50

Historical reason
GradientBoostingRegressor was implemented many years ago. At that time, the choice was made to make its API look like that of RandomForestRegressor.

However, gradient boosting is just an optimization process in which we minimize a loss (similarly to a linear model). Each new tree corresponds to one iteration of that process and can only reduce the training loss. Note that this is not the case with RandomForestRegressor. Therefore, max_iter better reflects what is happening under the hood.
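
To see the "one tree = one iteration" view in practice, here is a small sketch using GradientBoostingRegressor, whose train_score_ attribute stores the training loss after each iteration. The synthetic dataset and the hyperparameter values are assumptions chosen for illustration only:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data, purely illustrative
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Here the number of boosting iterations is called n_estimators
gbr = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, random_state=0
)
gbr.fit(X, y)

# train_score_[i] is the loss after iteration i; with the default
# subsample=1.0 it is computed on the full training set
train_loss = gbr.train_score_
print(train_loss[:5])

# Check that the training loss never goes up from one iteration to the next
# (a tiny tolerance absorbs floating-point rounding)
print(bool(np.all(np.diff(train_loss) <= 1e-9)))
```

The same check would not make sense for a random forest, since its trees are fitted independently rather than one after another on the current loss.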