Surprised how the target could be a string

Why do we keep target as a string? Is it converted to int automatically by scikit-learn?


Yes, in most classification tasks scikit-learn encodes the target internally, so string labels can be passed as-is and predictions come back as the original labels.
See, for instance, this line of code used in the base algorithm for LogisticRegression and LinearSVC/LinearSVR.
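To see this from the user side, here is a minimal sketch on a made-up toy dataset (the arrays `X` and `y` are invented for illustration). It fits `LogisticRegression` directly on string labels, and separately applies `LabelEncoder` to show the kind of integer codes that such an encoding produces. It illustrates the observable behaviour only, not the exact internal code path.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Toy data, invented for illustration: one feature, string targets.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array(["cat", "cat", "dog", "dog"])

# The classifier accepts the string target without manual conversion.
clf = LogisticRegression().fit(X, y)

# The unique labels are stored (sorted) on the fitted estimator.
print(clf.classes_)

# Predictions are returned as the original string labels.
pred = clf.predict(np.array([[0.0], [3.0]]))
print(pred)

# The same kind of mapping, done explicitly: each label becomes an integer index.
encoded = LabelEncoder().fit_transform(y)
print(encoded)
```

So the integers only exist inside the estimator; from the outside you keep working with the readable labels.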

For KNeighborsClassifier, encoding is in principle not even necessary, as the algorithm computes pairwise distances between the features and then directly reads the target label of the closest sample(s).
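The nearest-neighbour idea can be sketched by hand as below, again on invented toy data (`X_train`, `y_train` and `query` are made up for illustration). The manual version uses a plain absolute difference as the distance, which is enough for a single feature, and looks up the label of the closest training sample without ever converting the labels to integers.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data, invented for illustration: one feature, string targets.
X_train = np.array([[0.0], [1.0], [10.0], [11.0]])
y_train = np.array(["low", "low", "high", "high"])
query = np.array([[0.4], [10.6]])

# scikit-learn's estimator, using only the single closest neighbour.
knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
knn_pred = knn.predict(query)

# Manual 1-nearest-neighbour: pairwise distances, then read the label directly.
dist = np.abs(query - X_train.T)          # shape (n_queries, n_train)
manual_pred = y_train[dist.argmin(axis=1)]

print(knn_pred)
print(manual_pred)
```

Both approaches return the string labels of the closest training samples.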