Averaging alpha over the CV folds

In the very last part of the notebook, it states that it is common practice to average alpha over all the folds. Is there a paper or a reference that discusses using the average of alphas over the CV folds?

Secondly, since the alphas were sampled in logspace with base 10, does it make sense to go back to the uniform space for averaging? For example:

np.power(10, np.mean(np.log10(best_alphas)))

In the very last part of the notebook, it states that it is common practice to average alpha over all the folds. Is there a paper or a reference that discusses using the average of alphas over the CV folds?

Not sure. The alternative would be to average the CV score curves and keep the alpha value that minimizes the average CV score curve.

Indeed, an average in log space might make more sense than in absolute space.
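To make the difference concrete, here is a minimal sketch comparing the two averages. The values in `best_alphas` are hypothetical and chosen only for illustration; they do not come from the notebook.

```python
import numpy as np

# Hypothetical best alpha per CV fold (illustrative values, not from the notebook).
best_alphas = np.array([0.01, 0.1, 0.1, 1.0, 100.0])

# Average in absolute space: dominated by the single largest value.
arithmetic_mean = np.mean(best_alphas)

# Average the base-10 exponents, then map back. This is the geometric mean.
log_space_mean = np.power(10, np.mean(np.log10(best_alphas)))

print(f"arithmetic mean: {arithmetic_mean:.3f}")
print(f"log-space mean:  {log_space_mean:.3f}")
```

With these values the arithmetic mean is pulled up to about 20 by the one fold that picked alpha = 100, while the log-space mean stays around 0.4, closer to what most folds selected.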

EDIT: we should probably change the content of this notebook so that it does not recommend averaging the best alpha values found on each CV fold, but instead averages the CV scores across the folds and then takes the value of alpha that minimizes the average test score.
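A rough sketch of that alternative, assuming the per-fold test errors are already stored in an array with one row per fold and one column per candidate alpha. The names `alphas` and `cv_errors` and all the numbers are made up for illustration and are not the notebook's variables.

```python
import numpy as np

# Candidate alphas sampled in log space: 0.01, 0.1, 1, 10, 100.
alphas = np.logspace(-2, 2, num=5)

# Hypothetical test errors, shape (n_folds, n_alphas). Lower is better.
cv_errors = np.array([
    [5.0, 4.0, 3.0, 3.5, 6.0],
    [4.5, 3.8, 3.2, 3.0, 5.5],
    [2.9, 3.0, 3.1, 4.0, 7.0],
])

# Option A: pick the best alpha on each fold, then average those alphas.
per_fold_best_idx = np.argmin(cv_errors, axis=1)
per_fold_best_alphas = alphas[per_fold_best_idx]
alpha_from_fold_average = np.mean(per_fold_best_alphas)

# Option B: average the error curves over folds, then pick the single best alpha.
mean_error_curve = np.mean(cv_errors, axis=0)
alpha_from_mean_curve = alphas[np.argmin(mean_error_curve)]

print("per-fold best alphas:", per_fold_best_alphas)
print("option A (average of best alphas):", alpha_from_fold_average)
print("option B (argmin of mean curve):", alpha_from_mean_curve)
```

Option B always returns one of the alphas that was actually evaluated, whereas option A can return a value that was never tested on any fold.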


Hello!
Do you mean it is better to choose alpha in such a way? I did not fully get your point (cv_alphas is the same as in the notebook).