Cv (fold parameter)

nktnlx · 10 April 2022 07:54

Is there any way to “grid search” the cv fold parameter using sklearn's GridSearchCV or RandomizedSearchCV? Would such a search make any sense? Or does the default value of 5 work well for most cases a typical user will encounter?

glemaitre58 · 10 April 2022 16:01

This is not a parameter that you should tune.
A good value will always be a trade-off: the more splits you have, the more computation time it takes and the more data you need (so that each fold does not end up with too few samples). The latter point can be alleviated by using RepeatedKFold, which repeats a K-fold split several times with different shuffling.

Having many splits gives you a better sense of the score distribution, and thus of the uncertainty of the estimate.
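
To make this concrete, here is a minimal sketch of collecting a score distribution with RepeatedKFold. The synthetic dataset, the LogisticRegression model, and the choice of 5 splits repeated 10 times are assumptions picked only for illustration; swap in your own data and estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic data and a simple model, used here only as placeholders.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# 5 folds repeated 10 times with different shuffles gives 50 scores,
# while each test fold still holds 1/5 of the samples.
cv = RepeatedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)

# The spread of the scores gives a rough idea of the uncertainty.
print(f"{len(scores)} scores: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

The number of model fits grows as n_splits * n_repeats, so the computational trade-off mentioned above still applies.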

nktnlx · 13 April 2022 16:57

Thank you for your reply and the illuminating explanation.
Enjoying the course a lot!