Exercise M2.01 error: ValueError: could not convert string to float: 'donated'

mab66 · 1 June 2021 16:57

Hello, my code is nearly the same as the correction, but I get this error.

entry 8 is

from sklearn.model_selection import validation_curve
import numpy as np
range_gamma = np.logspace(-3, 2, num=30)
train_scores, test_scores = validation_curve(
    clf, data, target, param_name="svc__gamma", param_range=range_gamma,
    cv=ShuffleSplit(random_state=0), scoring="neg_mean_absolute_error", n_jobs=2)

clf is the pipeline

clf = make_pipeline(StandardScaler(), SVC())

nineggs · 1 June 2021 19:58

I think you do not need to provide a scoring parameter in your “validation_curve” as the score is already calculated between the target_predicted and the target_test.

We used it in the lecture because the target there was continuous rather than categorical, so we had to specify a regression metric to evaluate the score.

glemaitre58 · 1 June 2021 20:58

The real issue is that you are using a classifier (SVC()) with a regression metric (neg_mean_absolute_error). The error given by scikit-learn is not super informative, though.

From the error, I assume that you are trying to solve a classification problem. The target contains the classes: donated and not donated. Thus you need to use a classification metric when running the validation curve; see section 3.3, "Metrics and scoring: quantifying the quality of predictions", in the scikit-learn 0.24.2 documentation.

Accuracy would probably be enough and, as mentioned by @nineggs, it is the metric used by default for a classifier if you don’t specify scoring.
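
For illustration, here is a minimal sketch of the validation curve with a classification metric. The data is a synthetic stand-in generated with make_classification (an assumption, not the exercise dataset), with the labels mapped to the strings "donated" and "not donated" so that it resembles the target discussed here. The number of gamma values and of splits is reduced to keep it quick.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import ShuffleSplit, validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the exercise data (assumption): 4 numeric
# features and a target made of string class labels.
X, y_int = make_classification(n_samples=200, n_features=4, random_state=0)
target = np.where(y_int == 1, "donated", "not donated")

# Same pipeline as in the question.
clf = make_pipeline(StandardScaler(), SVC())

# Fewer gamma values than in the question, to keep the sketch fast.
range_gamma = np.logspace(-3, 2, num=5)

# "accuracy" is a classification metric, so string labels are fine.
# Leaving scoring unset would fall back to the classifier's own
# score method, which is also accuracy for SVC.
train_scores, test_scores = validation_curve(
    clf, X, target,
    param_name="svc__gamma", param_range=range_gamma,
    cv=ShuffleSplit(n_splits=5, random_state=0),
    scoring="accuracy",
)

# One row per gamma value, one column per cross-validation split.
print(train_scores.shape, test_scores.shape)
```

Each row of train_scores and test_scores corresponds to one value of gamma and each column to one split, so averaging along axis 1 gives the curve to plot against range_gamma.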

mab66 · 2 June 2021 08:11

Thank you, that’s it

mab66 · 2 June 2021 08:15

Thank you Guillaume, with this error I understand the difference between regression and classification scoring metrics, which I had not understood before…