This question asks us to evaluate max_iter. However, inside the cross-validation function, param_range is set to the n_estimators array.
Is that correct?
Sorry, I don’t understand your question. Can you be more explicit?
I assume it might be linked to HistGradientBoostingClassifier, which has a max_iter parameter. However, n_estimators is a parameter of GradientBoostingClassifier, so I assume that you are actually looking at GradientBoostingClassifier instead.
Be sure to use the right class if this is indeed the issue.
In question 6, I looked at the explanation section. As I understand it, we investigate the influence of the max_iter parameter on the performance of HistGradientBoostingRegressor. However, in the source code below, the array passed as param_range is n_estimators.
from sklearn.ensemble import HistGradientBoostingRegressor
hgbdt = HistGradientBoostingRegressor(random_state=0)
max_iter = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1_000, 2_000, 5_000]
train_scores_hgbdt, test_scores_hgbdt = validation_curve(
    hgbdt, data, target, param_name="max_iter", param_range=n_estimators, cv=cv, n_jobs=2
)
I wonder whether this changes the final result.
PS: I fixed it and ran the program again. The result remains the same, so it is just a semantic error.
Indeed, this is an error. We wanted to pass max_iter instead. I assume that the previously defined n_estimators probably had the same values, which is why it works, but it was not intended. We will fix it. Thanks for reporting.
We should fix this error in the next version. I assume it does not change the results because n_estimators == max_iter.