Potential for quiz M3.02 question 4 improvement

Question 4 is still confusing. The question asks “What do all these models have in common?”,
and I think it is impossible to answer both “too large” and “too small” for the same parameter because of the word “all”.

Question 5, which covers the same idea for the first parameter, is a lot clearer in comparison.

Hi @F-Dem, and welcome to the forum, since I see this is your first post! I moved your post to a new topic since it did not seem related to the topic you originally posted in.

From what I understand, you are talking about this question:

From what I remember, we indeed want something that all badly performing models have in common. If you could explain a bit more what you find confusing, that would help us a lot to improve the situation!

Additional constraint: try not to give the answer away, because we don’t want the quiz answers in the forum :pray:

Just to clarify: when I said “all badly performing models”, I was talking about the models from the random search in the CSV file (we give the code in the question to draw a parallel coordinates plot for these models).

We are not making a statement about all badly performing models in the world (it is probably not possible to make such a general statement).

That point is fine: obviously the question is about the models we just investigated.

Let me rephrase my remark with an example, to avoid spoiling the answer:
We have models depending on two parameters, say para_a and para_b, both between 0 and 10.
We draw the parallel coordinates plot and we find:

  • model1 → para_a = 0 and para_b = 2, score = 0.5
  • model2 → para_a = 10 and para_b = 3, score = 0.6
  • model3 → para_a = 5 and para_b = 8, score = 0.95
  • model4 → para_a = 9 and para_b = 9, score = 0.6

In this case, the expected answers for Q4 are “too small para_a” and “too large para_a”, whereas having a too small para_a is not a “common point” of all bad models: clearly model2 is a bad one, but its para_a is not too small!

(But for Q5, I agree that a too small or too large para_a is never used to obtain a good model.)
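The toy example above can be reproduced with a small pandas sketch (the model names, parameter values, and scores are the hypothetical ones from the example, not real MOOC results). Filtering with a Q4-style threshold (score < 0.8) keeps model2, whose para_a is large, so “too small para_a” is not a property shared by all the bad models:

```python
import pandas as pd

# Hypothetical results from the example above (not the actual MOOC CSV).
results = pd.DataFrame(
    {
        "para_a": [0, 10, 5, 9],
        "para_b": [2, 3, 8, 9],
        "score": [0.5, 0.6, 0.95, 0.6],
    },
    index=["model1", "model2", "model3", "model4"],
)

# Q4-style selection: badly performing models (score < 0.8).
bad_models = results[results["score"] < 0.8]
print(bad_models)

# "too small para_a" is NOT common to all bad models:
# model2 is bad even though its para_a is 10 (too large, not too small).
assert not (bad_models["para_a"] < 2).all()
```

With real data, the same DataFrame could then be passed to a parallel coordinates plotting function instead of being inspected by hand.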

OK I think I get your point, thanks for your patience and your detailed explanation! Indeed this question needs to be rephrased …

What do you think about: “Looking at the plot, which parameter values always cause the model to perform badly?”

Better suggestions are more than welcome!

It’s perfect… but that is Q5! Maybe you could merge Q4 and Q5 and keep only Q5.

I would rather not merge the questions to avoid multiple-choice questions with too many options.

Edit: please ignore this reply; it does not make sense in the specific context of this quiz question, as the list of possible answers is the same for Q4 and Q5. I had another quiz in mind.

If I understand correctly the “merging” suggestion, the idea would be to keep only one question, namely Q5.

I agree that Q4 and Q5 are very similar (they are not exactly the same but the difference is tiny). In an ideal world we would remove Q4 and keep only Q5.

A technical hurdle: I am not sure we can remove a quiz question in a running MOOC session, so maybe this can only be done for MOOC version 2. Maybe @lfarhi knows more about this.

I agree Q4 and Q5 are quite similar, but they are not exactly the same, although the difference is tiny…

  • Q4: select badly performing models (accuracy < 0.8); what properties do some of these models have in common?
  • Q5: select top-performing models (accuracy > 0.85); what properties prevent a model from being amongst the top-performing models?

The tiny difference is 0.8 vs 0.85, so the answer is not exactly the same in both cases…

I think it’s good to keep both, as they provide two complementary ways to analyze the results (even if there is some redundancy).

OK fair enough so I’ll make my change as mentioned in Potential for quiz M3.02 question 4 improvement - #5 by lesteve

I fixed it in https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/commit/a2e2a8df364273fcd3dd3cd48241511c48e968f3. This needs to be changed in FUN.

I know that this has been accepted as a solution, but even with this change, I still think the question is ambiguous.
Without giving the answer away, I still think the issue raised by F-Dem has not been resolved. For the sake of the exercise, either adding new answer options or changing the definition of what badly performing models are would, in my view, resolve the issue.