Question 5 Numerical features

from sklearn.model_selection import train_test_split
data_train, data_test, target_train, target_test = train_test_split(
    numerical_features, target, random_state=42)

Whenever I run this code, I get the following error:
ValueError: Found input variables with inconsistent numbers of samples: [24, 1460]
How should I approach the question?

Hi. I did something like this:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

# splitting data into train and test sets

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(???)

from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
model = make_pipeline(???)

#fitting data into the model
model.fit(???)

from sklearn.model_selection import cross_validate
cv_results = cross_validate(???)
cv_results

# cross-validation score

from sklearn.model_selection import cross_val_score
print(cross_val_score(???))

I put ??? so as not to reveal my code :)
Hope it helps.

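Notice that numerical_features contains the names of the numerical features (24 names), not the data restricted to those columns, i.e. data[numerical_features]. That is why train_test_split reports 24 samples for the first argument and 1460 for the target: it is counting the items in the list of names rather than the rows of the dataframe.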
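
To make the difference concrete, here is a minimal sketch on a tiny made-up dataframe. The column names, values, and the data_numeric variable are assumptions for illustration only, not the course dataset. It first reproduces the error by passing the list of names, then splits the selected columns instead.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real dataset: 8 rows, 2 numerical columns
# and 1 non-numerical column (names and values are made up).
data = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550, 14260, 14115, 10084, 10382],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000, 1993, 2004, 1973],
    "Street": ["Pave"] * 8,
})
target = pd.Series([1, 0, 1, 0, 1, 0, 1, 0], name="target")

# A plain list of column names, not the data itself.
numerical_features = ["LotArea", "YearBuilt"]

# Passing the list of names reproduces the error:
# the list has 2 items while the target has 8 rows.
try:
    train_test_split(numerical_features, target, random_state=42)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Selecting the columns from the dataframe gives one row per sample,
# so the lengths of the data and the target now agree.
data_numeric = data[numerical_features]
data_train, data_test, target_train, target_test = train_test_split(
    data_numeric, target, random_state=42)
print(data_train.shape, data_test.shape)
```

With the real data the same idea applies: select the columns with data[numerical_features] and pass that to train_test_split instead of the list of names.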