Is this the workflow of GridSearchCV?

I apologize for the flurry of questions, but I think that understanding the SearchCV methods and nested CV is very important for using scikit-learn properly, so please bear with me :sweat_smile: I promise this will be my last question for this module! Can you please confirm that this is the workflow followed by GridSearchCV? I'm assuming the default CV strategy (5-fold CV).

Given a hyperparameter grid of $N = \prod_{j=1}^{d} n_j$ points, where d is the number of hyperparameters and $n_j$ is the number of levels for hyperparameter j:

  • for each grid point $\theta_i$ in $[\theta_1, \ldots, \theta_N]$:
    • set model hyperparameters to $\theta_i$
    • for each of the 5 (train, test) splits, fit model on train, compute score on test
    • average the 5 test_scores
    • store the result in mean_test_score[i]

Finally, find the index ibest such that mean_test_score[ibest] is the maximum, and refit the model on the whole (unsplit) dataset using the corresponding hyperparameter setup $\theta_{ibest}$. Correct?
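
To make the steps above concrete, here is a minimal sketch of the same loop written by hand and compared against GridSearchCV. The dataset (iris), the estimator (KNeighborsClassifier) and the grid values are assumptions chosen only for illustration; they are not part of the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ParameterGrid, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data and grid (assumptions): d = 2 hyperparameters, 3 * 2 = 6 grid points.
X, y = load_iris(return_X_y=True)
param_grid = {"n_neighbors": [1, 5, 15], "weights": ["uniform", "distance"]}

grid_points = list(ParameterGrid(param_grid))
mean_test_score = np.empty(len(grid_points))

for i, params in enumerate(grid_points):
    # Set the model hyperparameters to the current grid point.
    model = KNeighborsClassifier().set_params(**params)
    # Fit on each of the 5 train splits and score on the matching test split.
    test_scores = cross_val_score(model, X, y, cv=5)
    # Average the 5 test scores and store the result.
    mean_test_score[i] = test_scores.mean()

# Pick the best grid point and refit on the whole dataset.
ibest = int(np.argmax(mean_test_score))
best_model = KNeighborsClassifier().set_params(**grid_points[ibest]).fit(X, y)

# The same search done by GridSearchCV, for comparison.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5).fit(X, y)
print(mean_test_score)
print(search.cv_results_["mean_test_score"])
```

If the workflow described above is right, the two printed arrays should agree up to floating-point rounding.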

Yes, it is. Be aware that the final refit on the whole dataset is controlled by the refit parameter, which is True by default.
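
As a small sketch of what that parameter changes (the dataset, estimator and grid are assumptions picked for illustration): with the default refit=True the search exposes a best_estimator_ fitted on all samples, while with refit=False only the search results are kept.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data and grid (assumptions).
X, y = load_iris(return_X_y=True)
param_grid = {"n_neighbors": [1, 5, 15]}

# Default behaviour: refit=True, so the best model is refit on all of X, y.
with_refit = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5).fit(X, y)

# refit=False: the cross-validated search runs, but no final model is fitted.
no_refit = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, refit=False).fit(X, y)

print(hasattr(with_refit, "best_estimator_"))
print(hasattr(no_refit, "best_estimator_"))
print(with_refit.best_params_, no_refit.best_params_)
```

In both cases cv_results_ and best_params_ are still available, so you can refit yourself later if you turned the option off.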
