Q5: Should say "strictly better than" instead of "better than"

In Question 5 we have to compare models, and one is considered better if:

at least 7 of the cross-validations scores are better

That holds if "better" means "better or equal", which is what I assumed. But I got the answer wrong because in the explanation it means strictly better:

5-NN with StandardScaler is strictly better than 5-NN with MinMaxScaler for ...

That is very confusing because many of the test scores are equal. The question needs to be clearer.

Just to make sure we are on the same page: are you comparing the cross-validation test scores of both models fold-to-fold, i.e. counting the number of folds where one model has a better test score than the other, as mentioned in Question 4?

In that sense I would argue that:

  • at least 7/10 folds is substantially better;
  • 4 to 6 folds out of 10 is considered equal;
  • 3/10 or less is substantially worse.
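To make those thresholds concrete, here is a minimal sketch assuming 10 folds and a strict comparison. The helper `compare_fold_to_fold` and the scores are made up for illustration; they are not the quiz's code or data.

```python
import numpy as np


def compare_fold_to_fold(score_reference_model, score_other_model):
    """Count the folds where the reference model is strictly better and
    label the result with the 7 / 4-6 / 3 thresholds (assumes 10 folds)."""
    reference = np.asarray(score_reference_model, dtype=float)
    other = np.asarray(score_other_model, dtype=float)
    # Element-wise comparison gives one boolean per fold; summing counts the True values.
    n_better = int(np.sum(reference > other))
    if n_better >= 7:
        verdict = "substantially better"
    elif n_better >= 4:
        verdict = "equal"
    else:
        verdict = "substantially worse"
    return n_better, verdict


# Hypothetical test scores for 10 folds, with several exact ties.
reference_scores = [0.95, 0.97, 0.94, 0.96, 0.95, 0.97, 0.93, 0.96, 0.95, 0.94]
other_scores = [0.94, 0.97, 0.94, 0.95, 0.95, 0.96, 0.93, 0.95, 0.95, 0.93]

n_better, verdict = compare_fold_to_fold(reference_scores, other_scores)
print(f"{n_better} folds strictly better -> {verdict}")
```

With these made-up numbers, ties are not counted as wins, so the reference model lands in the "equal" band even though it is never worse.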

Thanks for your reply,

I am talking about Question 5, statement a).
In the explanation, if you make the inequality non-strict by replacing

f"{sum(score_reference_model > score_other_model)} CV iterations "

by

f"{sum(score_reference_model >= score_other_model)} CV iterations "

we’ll see that the statement holds:

Thus, a 5-NN model with a StandardScaler does not perform substantially better than the models that use alternative scaling strategies.
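Here is a small sketch of how much the two counts can differ when folds are tied. The variable names follow the snippet from the explanation, but the scores are made up for illustration and are not the actual quiz results.

```python
import numpy as np

# Hypothetical test scores for 10 folds: 5 wins, 3 ties, 2 losses for the reference model.
score_reference_model = np.array(
    [0.95, 0.97, 0.94, 0.96, 0.95, 0.97, 0.93, 0.96, 0.95, 0.94]
)
score_other_model = np.array(
    [0.94, 0.97, 0.95, 0.95, 0.95, 0.96, 0.94, 0.95, 0.95, 0.93]
)

# Comparing two arrays gives one boolean per fold; summing counts the True values.
n_strict = int(np.sum(score_reference_model > score_other_model))
n_non_strict = int(np.sum(score_reference_model >= score_other_model))
n_ties = int(np.sum(score_reference_model == score_other_model))

print(f"strictly better in {n_strict} CV iterations")
print(f"better or equal in {n_non_strict} CV iterations")
print(f"tied in {n_ties} CV iterations")
```

With these numbers, the "at least 7" threshold is met with `>=` but not with `>`, which is exactly the ambiguity in the wording.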

I agree this question is a little confusing, especially when you count manually…
The code proposed for counting is not completely obvious to me… :slight_smile:

Thanks for your feedback :slight_smile: I am tagging this as a priority to review for version 3.0 of the MOOC.