I am confused about why the score arrays have 5 columns when I run the code below. I understand the rows (one for each n_estimators value tried), but not the columns.
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import validation_curve
adaboost = AdaBoostRegressor()
param_range = np.unique(np.logspace(0, 1.8, num=30).astype(int))
print(param_range)
# Compute training and validation scores for each n_estimators value
train_scores, test_scores = validation_curve(
    estimator=adaboost,
    X=data_train,
    y=target_train,
    param_name="n_estimators",
    param_range=param_range,
    scoring="neg_mean_absolute_error",
    n_jobs=-1,
)
# The scores are negative mean absolute errors, so negate them to get positive errors
train_errors, test_errors = -train_scores, -test_scores
# Why are there 5 columns?
train_df = pd.DataFrame(train_errors)
Also, when I ran print(adaboost.n_estimators) after the validation, the output was 50 rather than 63 (the highest n_estimators value in param_range). What does this number refer to?
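In case it helps, here is a self-contained sketch that should reproduce both observations without my data. It rests on a few assumptions: the synthetic data from make_regression, the names X_demo and y_demo, and random_state=0 are stand-ins for illustration and are not part of my real setup. I also left out n_jobs to keep it simple.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import validation_curve

# Synthetic stand-in for data_train / target_train (assumption, not my real data)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)

adaboost = AdaBoostRegressor(random_state=0)
param_range = np.unique(np.logspace(0, 1.8, num=30).astype(int))

# Same call as in my code, with every other argument left at its default
train_scores, test_scores = validation_curve(
    adaboost,
    X_demo,
    y_demo,
    param_name="n_estimators",
    param_range=param_range,
    scoring="neg_mean_absolute_error",
)

print(len(param_range))       # number of distinct n_estimators values tried
print(train_scores.shape)     # (rows, columns) of the score array
print(adaboost.n_estimators)  # attribute of the original estimator after the call
```

Comparing the first two printed values shows whether the rows line up with param_range, and the last one shows what the original adaboost object holds after validation_curve returns.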
Thanks!