Discussion: is the LogisticRegression better than the Dummy for the exercise?

Literally speaking, as the exercise solution explains, it does improve by around 5% compared with the dummy classifier that always predicts the most frequent class.

Is this a correct assumption for a day-to-day use case? I.e., would a 5% improvement / 82% accuracy be reasonable for real-world use cases?

In my exercise I ended up concluding that although it was better, it was still quite close to just predicting the most frequent class, so I did not consider it “good”.


How much the difference between two models matters depends very much on the use case under consideration.

In practice, whether a model is good or not depends on the end goal you have in mind.

In a research setting, how would this 5% improvement help to solve the particular scientific problem you are trying to tackle?

In a business setting, how would this 5% improvement turn into an improvement for the business (cost savings, shorter waiting times, etc.)?

It seems like this blog post talks a bit about this kind of thing
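
As a rough illustration of why the same gap can look small or large depending on how you frame it, here is a minimal sketch in plain Python. It assumes the dummy sits at about 77% accuracy (the 82% quoted above minus the 5-point gain); the function name `compare_to_baseline` and the batch of 1000 predictions are hypothetical, chosen only for the example.

```python
def compare_to_baseline(model_acc, baseline_acc, n_predictions=1000):
    """Summarize how much a model improves on a baseline classifier.

    Accuracies are fractions in [0, 1]. n_predictions is a hypothetical
    batch size used to express the gain as a count of avoided mistakes.
    """
    if baseline_acc >= 1.0:
        raise ValueError("baseline makes no errors, nothing to reduce")

    # Gain in accuracy, in absolute terms (0.05 means 5 percentage points).
    absolute_gain = model_acc - baseline_acc

    # Share of the baseline's mistakes that the model no longer makes.
    baseline_err = 1.0 - baseline_acc
    model_err = 1.0 - model_acc
    relative_error_reduction = (baseline_err - model_err) / baseline_err

    # Expected number of mistakes avoided over n_predictions predictions.
    errors_avoided = round(absolute_gain * n_predictions)

    return {
        "absolute_gain": absolute_gain,
        "relative_error_reduction": relative_error_reduction,
        "errors_avoided": errors_avoided,
    }


# Assumed figures: 82% for the logistic regression, about 77% for the dummy.
summary = compare_to_baseline(model_acc=0.82, baseline_acc=0.77)
print(summary)
```

With these assumed numbers, the 5-point gain corresponds to roughly 22% fewer errors than the dummy, or about 50 fewer mistakes per 1000 predictions. Whether that is worth it still comes down to what a mistake costs in your research or business setting.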