Conclusion of M3.01?

In the text you asked us to take a look at what happens when we use a larger training set, but you did not show the results in the solution.

When I tested the code with a training set of 80%, the best score I see is 0.870 for a learning_rate of 0.1 and max_leaf_nodes of 30. These results are very close to the results for the full dataset (best score of 0.872, also with a learning_rate of 0.1 and max_leaf_nodes of 30).
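For reference, here is a minimal sketch of the experiment I ran. The dataset below is synthetic (`make_classification`) standing in for the course data, and the grid values are my assumptions, so treat it as an illustration rather than the exercise's exact code:

```python
# Sketch: grid search over learning_rate and max_leaf_nodes on an 80% subsample.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the course dataset.
X, y = make_classification(n_samples=20_000, random_state=0)

# Keep 80% of the samples to mimic the "larger training set" experiment.
X_80, _, y_80, _ = train_test_split(X, y, train_size=0.8, random_state=0)

best_score, best_params = -float("inf"), None
for learning_rate in (0.01, 0.1, 1.0, 10.0):      # illustrative grid
    for max_leaf_nodes in (3, 10, 30):            # illustrative grid
        model = HistGradientBoostingClassifier(
            learning_rate=learning_rate, max_leaf_nodes=max_leaf_nodes
        )
        score = cross_val_score(model, X_80, y_80, cv=2).mean()
        if score > best_score:
            best_score = score
            best_params = (learning_rate, max_leaf_nodes)

print(f"best score: {best_score:.3f} with "
      f"(learning_rate, max_leaf_nodes) = {best_params}")
```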

Could the conclusion be that max_leaf_nodes is the parameter to tune, since it is the one that changes when the size of the data increases?

Thanks for your answers.

I did not try the code, but in theory, keeping the learning_rate as-is, it is reasonable to think that max_leaf_nodes could be tuned, as well as max_iter. Having more samples means that you might need more trees to correct the residuals of the previous trees in the ensemble, and adding trees helps with that. max_leaf_nodes allows deeper individual trees. When tuning, you should expect an interaction between the two parameters, as the sketch below suggests.
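A hypothetical sketch of searching max_leaf_nodes and max_iter jointly with GridSearchCV, to expose that interaction. The grids, the fixed learning_rate of 0.1, and the synthetic data are illustrative assumptions, not the course's code:

```python
# Sketch: tune max_leaf_nodes (tree depth) and max_iter (number of trees) together.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=20_000, random_state=0)

param_grid = {
    "max_leaf_nodes": [10, 30, 100],  # deeper individual trees
    "max_iter": [50, 100, 300],       # more trees in the ensemble
}
search = GridSearchCV(
    HistGradientBoostingClassifier(learning_rate=0.1),  # learning_rate kept fixed
    param_grid,
    cv=2,
)
search.fit(X, y)
print(search.best_params_, f"score: {search.best_score_:.3f}")
```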

Regarding the parameters of gradient boosting, the notebooks on ensemble methods will present them in more detail.