Hi Alvin,
first, here is the code in question, to make the explanation easier to follow:
    # This runs inside the two for loops, once per (lr, mln) combination
    scores = cross_val_score(model, data_train, target_train, cv=2)
    mean_score = scores.mean()
    print(f'scores: {mean_score:.3f}')
    # Keep this model only if it beats the best one seen so far
    if mean_score > best_score:
        best_score = mean_score
        best_params = {'learning_rate': lr, 'max_leaf_nodes': mln}
        print(f'Found new best model with score {best_score:.3f}!')
Before the loops, best_score is initialised to 0 and best_params to an empty dict.
In the two for loops, the parameters learning_rate and max_leaf_nodes are each picked from a list of values, and for each combination the scores of your model are evaluated by cross_val_score and stored as an array of 2 elements (cv=2).
Then the mean of the 2 scores is stored in the variable mean_score, and you compare mean_score to best_score. If mean_score is better than best_score, then best_score takes the value of mean_score and the parameters are stored in the dict best_params.
At each new turn of the loop, new scores are computed via cross_val_score(model, data_train, target_train, cv=2), so mean_score gets a new value, which is then compared to the current value of best_score.
So on the first turn of the loop you'll automatically get a new best model: the first model tested will in practice produce a mean_score > 0, so best_score takes the value of the mean_score of that first model. On the second turn you'll get a new best_score only if the mean_score of the second model is greater than the mean_score of the first model. And so on...
In Python, the value of a variable is not fixed: the same name can be reassigned a new value at any time.
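If it helps, here is a small trace of that update logic on its own, without scikit-learn. The list of mean scores is made up for illustration (these are not real results), and the names mean_scores and updates are mine.

```python
# Hypothetical mean scores, one per turn of the loop (not real results).
mean_scores = [0.80, 0.78, 0.85, 0.85, 0.90]

best_score = 0
updates = []  # turns at which best_score was replaced

for turn, mean_score in enumerate(mean_scores, start=1):
    if mean_score > best_score:
        best_score = mean_score  # the name now refers to the new value
        updates.append(turn)

print(updates)     # prints [1, 3, 5]
print(best_score)  # prints 0.9
```

Turn 1 always updates because 0.80 > 0. Turn 2 does not (0.78 is lower), and turn 4 does not either, because 0.85 is not strictly greater than the current best of 0.85.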
I hope this makes it clearer for you.