Wrong conclusion

I don’t agree with the conclusion given in the section “Using numerical and categorical variables together”:
Data type: score

  • numerical: 0.802 +/- 0.003
  • categorical: 0.872 +/- 0.003 (best)
  • numerical AND categorical: 0.851 +/- 0.003

Am I the only one?

With categorical features only and a LogisticRegression, I get:

The accuracy is: 0.833 +/- 0.002

Could you please share the pipeline you used to reach 0.872 +/- 0.003?

model2 = make_pipeline(OneHotEncoder(handle_unknown="ignore"),
                       LogisticRegression(max_iter=500))
cv_result2 = cross_validate(model2, data, target, cv=5, error_score="raise")
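
For anyone comparing numbers: a figure like "0.833 +/- 0.002" is usually the mean and standard deviation of the per-fold test scores. Here is a minimal sketch of that summary, assuming the scores come from cv_result2["test_score"]. The helper name summarize_scores and the fold values are made up for illustration and are not results from the notebook.

```python
from statistics import fmean, pstdev


def summarize_scores(scores):
    """Format per-fold scores as 'mean +/- std'.

    Uses the population standard deviation, which matches the default
    behaviour of numpy's std() (ddof=0).
    """
    return f"{fmean(scores):.3f} +/- {pstdev(scores):.3f}"


# Hypothetical per-fold accuracies, standing in for cv_result2["test_score"].
fold_scores = [0.830, 0.832, 0.834, 0.836, 0.833]
print(f"The accuracy is: {summarize_scores(fold_scores)}")
```

Summarising the scores the same way on both sides makes it easier to tell whether two runs really differ or only report the spread differently.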

This is the same model as the one in the notebook that gave me the score above. Could you make sure to run the notebook from start to end?