Excuse me, I set max_iter (the maximum number of trees in HistGradientBoostingClassifier), but if I need to know the number of trees needed to reach convergence, how can I obtain that number?
Thanks
You can force early_stopping=True and set a large number of iterations (max_iter). In this case, training stops as soon as the convergence requirements are met. These requirements are based on tol and n_iter_no_change. To know the size of the ensemble once trained, you can check the fitted attribute model.n_iter_.
Thank you, I have launched this code:
from sklearn.experimental import enable_hist_gradient_boosting  # only needed for scikit-learn < 1.0
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_validate

hist_gbdt = HistGradientBoostingClassifier(
    max_iter=1000, early_stopping=True, random_state=0
)
cv_results = cross_validate(
    hist_gbdt, data, target, cv=10, scoring="balanced_accuracy", n_jobs=2,
    return_estimator=True,  # keep the fitted model of each fold
)
cv_results["test_score"].mean()
When I try to get the average number of trees needed to reach convergence, I cannot find the correct statement. Can you help me, please?
You need to access the fitted estimators stored in cv_results. For instance, to access the estimator fitted on the first CV iteration, you can do:
cv_results["estimator"][0].n_iter_
You can then read n_iter_ for every CV iteration and average the values.