What about Coverage?

Dear Instructors,
Is this sentence true?
“A Classification model must have 100% coverage (overall) on the training dataset.”
Actually, this is an exercise in the book entitled “Data Mining and Machine Learning: Fundamental Concepts and Algorithms” by Mohammed J. Zaki and Wagner Meira, Jr., and their answer, surprisingly, is: True!
I wonder why that sentence is considered true. Have the authors of this book made a mistake in answering this question? Have they ignored the presence of noise in the training dataset?
Thanks

I don’t really know what 100% coverage means. Do you have a definition or something like this that you could provide?

With thanks for your attention: yes, in the book this concept is considered equivalent to “recall,” which is stated as “the fraction of correct predictions over all points for a certain class.” Must we really have 100% recall (overall) on the training dataset? That doesn’t seem sensible.
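For concreteness, one way to write that definition (my notation, not the book’s) for a class $c$ over $n$ training points with true labels $y_i$ and predictions $\hat{y}_i$:

$$\mathrm{recall}_c \;=\; \frac{\#\{\, i : y_i = c \ \text{and}\ \hat{y}_i = c \,\}}{\#\{\, i : y_i = c \,\}}$$

So 100% recall overall on the training set would require $\hat{y}_i = y_i$ for every single training point.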

Yes indeed, 99% of the time it will not be the case. The remaining 1% is the case where you achieve perfect classification on the training set, either because your model memorizes the dataset or because the classification problem is simple. But this is not a general rule.
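Here is a minimal sketch of that point, assuming scikit-learn is available; the synthetic data, the 10% label-flip rate, and the model choices are my own for illustration, not from the book:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y_true = (X[:, 0] > 0).astype(int)     # clean labels from a simple rule
flip = rng.random(200) < 0.10          # flip ~10% of labels to simulate noise
y_noisy = np.where(flip, 1 - y_true, y_true)

# Unconstrained tree: grows until it fits every point -> 100% training recall
memorizer = DecisionTreeClassifier(random_state=0).fit(X, y_noisy)
print(recall_score(y_noisy, memorizer.predict(X), average="macro"))  # 1.0

# Depth-limited tree: cannot fit the flipped labels -> training recall < 100%
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y_noisy)
print(recall_score(y_noisy, stump.predict(X), average="macro"))      # below 1.0
```

With noisy labels, only the unconstrained tree (which effectively memorizes the training set) reaches 100% training recall; the depth-limited tree cannot fit the flipped labels.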
