Q5: Should say "strictly better than" instead of "better than"

Mentalearner · 3 March 2022 09:04

In Question 5 we have to compare models, and one is considered better if:

at least 7 of the cross-validations scores are better

That is the case if "better" means "better or equal", which is what I assumed. But I got the answer wrong, because in the explanation it means strictly better:

5-NN with StandardScaler is strictly better than 5-NN with MinMaxScaler for ...

That is very confusing because a lot of the test scores are equal. The question needs to be clearer.

ArturoAmorQ · 3 March 2022 09:45

Just to make sure we are on the same page: are you comparing the cross-validation test scores of both models fold-to-fold, i.e. counting the number of folds where one model has a better test score than the other, as mentioned in Question 4?

In that sense I would argue that:

at least 7/10 folds is substantially better;
4 to 6 folds out of 10 is considered equal;
3/10 or less is substantially worse.
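
As a rough illustration of the thresholds above, here is a minimal sketch for 10-fold cross-validation. The function name `classify_comparison` and the label strings are made up for this example and are not part of the MOOC code.

```python
def classify_comparison(n_folds_better):
    """Map the number of folds (out of 10) where the reference model
    scores better onto the three categories listed above.

    Hypothetical helper: assumes exactly 10 cross-validation folds.
    """
    if not 0 <= n_folds_better <= 10:
        raise ValueError("n_folds_better must be between 0 and 10")
    if n_folds_better >= 7:
        return "substantially better"
    if n_folds_better >= 4:
        return "equal"
    return "substantially worse"


for n in (8, 5, 2):
    print(f"{n}/10 folds better -> {classify_comparison(n)}")
```

Note that the outcome depends on how ties are counted: a fold where both models get the same score only adds to `n_folds_better` if "better" is read as "better or equal".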

Mentalearner · 3 March 2022 12:08

Thanks for your reply,

I am talking about Question 5, statement a).
In the explanation, if you make the inequality non-strict by replacing

f"{sum(score_reference_model > score_other_model)} CV iterations "

by

f"{sum(score_reference_model >= score_other_model)} CV iterations "

we’ll see that the statement holds:

Thus, a 5-NN model with a StandardScaler does ~~not~~ perform substantially better than the models that use alternative scaling strategies.
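
To make the difference concrete, here is a minimal sketch with made-up scores (these are not the actual results from the exercise). The helper `count_folds` is hypothetical and uses plain Python lists instead of the arrays from the explanation.

```python
# Hypothetical per-fold test scores, invented for illustration only.
score_reference_model = [0.95, 0.95, 0.90, 0.95, 1.00, 0.90, 0.95, 0.90, 0.95, 0.95]
score_other_model = [0.95, 0.90, 0.90, 0.90, 0.95, 0.90, 0.90, 0.85, 0.95, 0.90]


def count_folds(reference, other, strict=True):
    """Count the folds where the reference score is greater than the other
    score (strict=True) or greater than or equal to it (strict=False)."""
    if strict:
        return sum(r > o for r, o in zip(reference, other))
    return sum(r >= o for r, o in zip(reference, other))


strict_count = count_folds(score_reference_model, score_other_model, strict=True)
non_strict_count = count_folds(score_reference_model, score_other_model, strict=False)
ties = non_strict_count - strict_count

print(f"strictly better in {strict_count} CV iterations")
print(f"better or equal in {non_strict_count} CV iterations")
print(f"tied in {ties} CV iterations")
```

With these invented numbers the strict count is below the 7-fold threshold while the non-strict count is above it, so the answer to the question flips depending on how ties are treated.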

Fox-PF · 4 March 2022 15:47

I agree this question is a little confusing, especially when you count it manually…
The code proposed for counting is not completely obvious to me…
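
For what it is worth, the counting expression can be unpacked step by step. Assuming the scores are NumPy arrays (as returned under `"test_score"` by scikit-learn's `cross_validate`), `score_reference_model > score_other_model` compares the two arrays fold by fold and gives one boolean per fold, and `sum(...)` then counts the `True` values because `True` counts as 1. A minimal sketch of the same logic with plain lists and made-up scores:

```python
# Hypothetical scores for three folds, invented for illustration only.
score_reference_model = [0.9, 0.8, 0.9]
score_other_model = [0.8, 0.8, 0.7]

# Step 1: compare fold by fold; one boolean per fold.
fold_is_better = [
    ref > other for ref, other in zip(score_reference_model, score_other_model)
]

# Step 2: summing booleans counts the True values (True == 1, False == 0).
n_better = sum(fold_is_better)

print(fold_is_better)
print(f"{n_better} CV iterations")
```

The tied second fold is `False` under the strict comparison, so it is not counted.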

ArturoAmorQ · 8 March 2022 09:19

Thanks for your feedback. I am tagging this as a priority to review for version 3.0 of the MOOC.