Conclusions from validation and learning curves of Ex M2.01

When interpreting the validation curve, it is said that "while for gamma < 1, it is not very clear if the classifier is under-fitting". For me, since the training accuracy only varies a little, from 0.77 to 0.82, as gamma varies, the model is still not flexible enough, and we can conclude that the classifier is under-fitting.
Also, on the learning curve, since the training accuracy and testing accuracy stay approximately constant when adding more samples, we can deduce that the classifier is under-fitting.
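For reference, the two curves discussed above can be recomputed with scikit-learn's `validation_curve` and `learning_curve`. This is only a sketch: it uses a synthetic dataset in place of the exercise's data, and the gamma range is an assumption.

```python
# Sketch of the two diagnostics discussed above (synthetic data, not the
# exercise's dataset; the gamma range is an illustrative assumption).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve, validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
model = make_pipeline(StandardScaler(), SVC())

# Validation curve: accuracy as a function of gamma.
gammas = np.logspace(-3, 2, 6)
train_scores, test_scores = validation_curve(
    model, X, y, param_name="svc__gamma", param_range=gammas, cv=5
)
# If the training score barely moves across the gamma range, the model's
# effective flexibility is not changing much -- one hint of under-fitting.
print(train_scores.mean(axis=1))

# Learning curve: accuracy as a function of the number of training samples.
sizes, train_lc, test_lc = learning_curve(
    model, X, y, train_sizes=np.linspace(0.2, 1.0, 5), cv=5
)
# Flat training and testing scores when adding samples suggest that more
# data will not help, which is consistent with under-fitting.
print(test_lc.mean(axis=1))
```

Note that both functions return one row per parameter value (or train size) and one column per cross-validation fold, so the mean and standard deviation across folds give the curve and its error bars.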

Can you please confirm or reject these conclusions?

I would also like to follow up on this discussion, because the conclusion is not super obvious.

Indeed, the error bars are huge for all testing scores, and although we understand the message and see a kind of trend, I am not convinced that a gamma of 1 is optimal based on this experiment.


In practice, how should we deal with similar scenarios, where the validation curve is not clear-cut?

On your figure, it seems that a gamma around 1 is where the test score is the highest, which is ultimately what you are looking for.

Under-fitting is characterized by two things:

  • close training and testing errors/scores
  • bad scores or high errors

Here, it is not really clear that we get a bad score. For instance, there is only a small gain between the optimal gamma and the smallest gamma. That is why it is not straightforward to say that the model under-fits.
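One practical way to judge whether that small gain matters is to compare it to the fold-to-fold spread of the cross-validated test scores. A minimal sketch, assuming a synthetic dataset and two illustrative gamma values (not the exercise's data):

```python
# Sketch: is the gain at the "optimal" gamma larger than the CV spread?
# (Synthetic data; the two gamma values are illustrative assumptions.)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

def scores_for(gamma):
    """Cross-validated test accuracies for an SVC with the given gamma."""
    model = make_pipeline(StandardScaler(), SVC(gamma=gamma))
    return cross_val_score(model, X, y, cv=10)

small, best = scores_for(1e-3), scores_for(1.0)
gain = best.mean() - small.mean()
spread = np.sqrt(small.std() ** 2 + best.std() ** 2)

# If the gain is comparable to (or smaller than) the fold-to-fold spread,
# the validation curve alone cannot reliably distinguish the two gammas.
print(f"gain={gain:.3f}, spread={spread:.3f}")
```

When the gain is buried inside the spread, collecting more data or averaging over more cross-validation folds is usually more informative than picking the gamma with the nominally highest mean score.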