K-neighbors fit to non-numerical data?

Hi,
While working through Module 1 with the K-nearest neighbors model, I ran
model.fit(data, target)
and got this error:
ValueError: could not convert string to float: ' Private'

After looking at the documentation at https://scikit-learn.org/, it seems to me that the non-numerical columns need to be dropped from the dataset.
Can the K-nearest neighbors algorithm be trained with string features?

Thanks
–Fernando

The program I ran is the following. The error appears in part #3.

#1
import pandas as pd

target_name = "class"  # define the target column before it is used below
adult_census = pd.read_csv("C:/Users/Fernando/datasets/phpMawTba.csv")
data = adult_census.drop(columns=[target_name])
target = adult_census[target_name]

#2
from sklearn import set_config
set_config(display='diagram')

#3
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier()
model.fit(data, target)

ValueError: could not convert string to float: ' Private'


Hello @FernandoB2B,

The K-nearest neighbors model computes distances between samples. Without preprocessing, this can only be applied to numeric values. You should either use the dataset adult-census-numeric.csv instead, or drop the non-numeric columns from the one you are using before fitting.
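
For the second option, here is a minimal sketch of one way to drop the non-numeric columns with pandas before fitting. The small DataFrame is a made-up stand-in for the real file (its column names and values are assumptions, not the actual dataset), and n_neighbors=3 is chosen only because the toy data has six rows.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data standing in for the adult census dataset;
# the column names and values are made up for illustration.
adult_census = pd.DataFrame({
    "age": [25, 38, 28, 44, 52, 31],
    "workclass": [" Private", " Private", " Local-gov",
                  " Private", " Self-emp", " Private"],
    "hours-per-week": [40, 50, 40, 40, 45, 30],
    "class": [" <=50K", " <=50K", " >50K", " >50K", " >50K", " <=50K"],
})

target_name = "class"
target = adult_census[target_name]
data = adult_census.drop(columns=[target_name])

# Keep only the numeric columns; string columns such as "workclass"
# are dropped, so the distance computation never sees a string.
data_numeric = data.select_dtypes(include="number")

# n_neighbors=3 is an arbitrary choice for this six-row example.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(data_numeric, target)
predictions = model.predict(data_numeric)
```

With your own file, the same select_dtypes call should apply to the data variable from part #1, assuming the string columns were loaded as text rather than as numbers.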

Thank you very much.
