Automated Tuning

I am using nested cross-validation for hyperparameter tuning. The grid has 2 values for classifier__learning_rate and 2 values for classifier__max_leaf_nodes, so I expect 4 grid points (4 different models), yet cv_results contains only 3 entries. Why is that?

from sklearn.model_selection import GridSearchCV, cross_validate

# model, data and target are assumed to be defined earlier.
param_grid = {
    'classifier__learning_rate': (0.05, 0.1),
    'classifier__max_leaf_nodes': (30, 40)}
model_grid_search = GridSearchCV(model, param_grid=param_grid,
                                 n_jobs=4, cv=2)

cv_results = cross_validate(
    model_grid_search, data, target, cv=3, return_estimator=True)

I think you are missing one point here:

  • GridSearchCV performs a grid-search using cross-validation when fit is called
  • cross_validate performs a cross-validation to evaluate the best model found by GridSearchCV: on each split, the grid-search is fitted on the training part and its best model is scored on the testing part

Thus, there are two cross-validations: (i) an inner cross-validation implemented by GridSearchCV and (ii) an outer cross-validation performed when calling cross_validate.

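cv_results contains the results of the outer cross-validation. It has 3 entries because cv was set to 3 in cross_validate. The 4 parameter combinations are evaluated inside each grid-search, not reported at this level. If you want to check the results of the inner cross-validation performed by the grid-search, you need to access the fitted estimators stored in cv_results, which are available because return_estimator=True was passed. For instance, you can access the grid-search fitted during the first outer iteration with: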

cv_results["estimator"][0]

and the results of its inner cross-validation, which contain the 4 grid points, with:

cv_results["estimator"][0].cv_results_