Shouldn’t KNeighborsClassifier(n_neighbors=1) always get 100% training accuracy?
But the results I got were:
train_score 0.882552
test_score 0.483987
How could a 1-neighbor KNN get any prediction wrong on the training set it has memorized?
The only explanation I can imagine is that there are a few data points with exactly the same values for the independent variables but different target values. In that case, even after memorizing the training data, the model couldn’t tell which target value is expected for those ambiguous feature values… Is that it?
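Here is a minimal sketch (with made-up data) of the situation I have in mind: two rows with identical features but different labels. Both identical rows are the same query, so they must receive the same prediction, and exactly one of them is misclassified:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: rows 0 and 1 have identical features but different labels
X = np.array([[0.0], [0.0], [1.0], [2.0]])
y = np.array([0, 1, 0, 1])

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X, y)

# The two identical rows get the same predicted label (whichever duplicate
# wins the distance-0 tie), so one of them is necessarily wrong: 3/4 correct.
print(knn.score(X, y))  # 0.75, not 1.0
```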