Small things wrt "Using numerical and categorical variables together"

Ahoi hoi folks,

As I have several smaller points I would like to mention, I thought one comprehensive post would be better than a separate post for each. However, if the latter is preferred, I can of course adapt accordingly. Please just let me know. Within the comments/points, typos and words/phrases that should be added or changed are marked in bold.

  • cell 19: “We can observe that we get significantly higher accuracies with the Gradient Boosting model.” → can/should this statement be made? It might give participants the impression that such claims about model comparison are possible without further quantification and without considering the non-independence of overlapping training sets
  • last bullet point in summary: “have seen …”
  • quiz, question 3, option c): “…category

I hope the points/comments are understandable. If not, please let me know.

HTH, cheers, Peer

I fixed the last two easy points. About “significantly”, I am going to open a separate topic. I agree that we use “significantly” in some places where we should either:

  • not use it and stay vague
  • explain what we mean by significantly in this context
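
As a rough illustration of the second option, here is a minimal sketch of how a score difference between two models could be described fold by fold instead of being called "significant". The function name `summarize_paired_scores` and the accuracy values are made up for illustration and are not taken from the notebook. The sketch assumes both models were evaluated on the same cross-validation splits. Because the training sets of the folds overlap, the reported spread is only descriptive and is not a formal significance test.

```python
from statistics import mean, stdev


def summarize_paired_scores(scores_a, scores_b):
    """Summarize fold-by-fold score differences between two models.

    Both sequences must come from the same cross-validation splits,
    so that entry i of each refers to the same test fold.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models need one score per fold")
    if len(scores_a) < 2:
        raise ValueError("need at least two folds")
    # Positive difference means model B scored higher on that fold.
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    return {
        "mean_diff": mean(diffs),
        "std_diff": stdev(diffs),
        "folds_b_better": sum(d > 0 for d in diffs),
        "n_folds": len(diffs),
    }


# Made-up accuracies for illustration only, not results from the notebook.
linear_scores = [0.80, 0.82, 0.81, 0.79, 0.83]
boosting_scores = [0.86, 0.87, 0.88, 0.85, 0.87]
summary = summarize_paired_scores(linear_scores, boosting_scores)
print(summary)
```

In practice, the two score lists could come from scikit-learn's `cross_validate`, called with the same `cv` splitter for both models and read from the `test_score` entry of each returned dictionary.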