Question 4 (NaN value)

Hi, everyone. I tried to evaluate the accuracy of the decision tree model, but I get NaN values for the test score. Any hints on how to troubleshoot this problem? Thanks in advance.

NaN is returned when the score could not be computed, usually because an error occurred. First, could you show the variable scores to check whether only one of the folds has a NaN value?
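As a minimal sketch of that check: the dictionary below is a hypothetical stand-in for what cross_validate returns (the score values are invented for illustration), and np.isnan is used to find which folds failed.

```python
import numpy as np

# Hypothetical scores, standing in for the dict returned by cross_validate.
cv_results = {"test_score": np.array([0.71, np.nan, 0.68])}

# Indices of the folds whose score could not be computed.
nan_folds = np.flatnonzero(np.isnan(cv_results["test_score"]))
print("Folds with NaN score:", nan_folds)
```

If every fold is NaN, the problem is most likely in the model definition itself rather than in one particular split of the data.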

Then, you can add error_score="raise" in the cross_validate call to see the underlying error.
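Here is a small sketch of what error_score="raise" does. The data is synthetic and the column names are made up. The model is deliberately broken (a tree fitted directly on a string column) so that the fit fails. This is not the poster's pipeline, only an illustration of the option.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Synthetic data with hypothetical column names.
X = pd.DataFrame({
    "area": rng.normal(size=40),
    "zone": rng.choice(["a", "b"], size=40),  # strings: the tree cannot fit on them
})
y = rng.normal(size=40)

caught = None
try:
    # With error_score="raise", the exception from the failing fit is
    # propagated instead of being replaced by a NaN score.
    cross_validate(DecisionTreeRegressor(), X, y, cv=5, error_score="raise")
except ValueError as exc:
    caught = exc
    print(f"{type(exc).__name__}: {exc}")
```

The printed message then points at the step that actually failed, which is much easier to act on than an array of NaN.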

Note: Be aware that you are using preprocessing that is typical for a linear model while using a decision tree. Usually, we only use an OrdinalEncoder for tree-based models.
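To illustrate the note, here is a sketch of tree-oriented preprocessing: categorical columns are ordinal-encoded and numerical columns are passed through unscaled. The tiny dataset, the column names and the handle_unknown settings are assumptions chosen for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: one categorical and one numerical column.
X = pd.DataFrame({
    "zone": ["a", "b", "a", "c", "b", "a"],
    "area": [50.0, 80.0, 55.0, 120.0, 75.0, 60.0],
})
y = [100.0, 160.0, 110.0, 240.0, 150.0, 120.0]

preprocessor = ColumnTransformer(
    transformers=[
        # Categories unseen during fit are encoded as -1 at predict time.
        ("cat", OrdinalEncoder(handle_unknown="use_encoded_value",
                               unknown_value=-1), ["zone"]),
    ],
    remainder="passthrough",  # numerical columns are left as they are
)
model = make_pipeline(preprocessor, DecisionTreeRegressor(random_state=0))
model.fit(X, y)
predictions = model.predict(X)
```

Trees split on thresholds, so scaling the numerical features does not change the splits they can find; that is why the scaler is left out here.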

This is what cv_results looks like when printed:

  {'fit_time': array([0.00198603, 0.00196958, 0.0018189 , 0.00192094, 0.00179267,
       0.00178552, 0.00173926, 0.00177884, 0.00173688, 0.00172091]),
'score_time': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]),
'estimator': [Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())]),
 Pipeline(steps=[('columntransformer',
                  ColumnTransformer(transformers=[('ordinalencoder',
                                                   OrdinalEncoder(handle_unknown='ignore'),
                                                   ['MSZoning', 'Street',
                                                    'Alley', 'LotShape',
                                                    'LandContour', 'Utilities',
                                                    'LotConfig', 'LandSlope',
                                                    'Neighborhood', 'Condition1',
                                                    'Condition2', 'BldgType',
                                                    'HouseStyle', 'RoofStyle',
                                                    'RoofMatl', 'Exterior1st',
                                                    'Exterior2nd', 'MasVnrType',
                                                    'ExterQual', 'ExterCond',
                                                    'Foundation', 'BsmtQual',
                                                    'BsmtCond', 'BsmtExposure',
                                                    'BsmtFinType1',
                                                    'BsmtFinType2', 'Heating',
                                                    'HeatingQC', 'CentralAir',
                                                    'Electrical', ...]),
                                                  ('standardscaler',
                                                   StandardScaler(),
                                                   SimpleImputer())])),
                 ('decisiontreeregressor', DecisionTreeRegressor())])],
'test_score': array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])}

I tried passing error_score="raise" in the cross_validate call and it shows the following error:

ValueError: No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowed

So the error is raised because something is wrong in the definition of the ColumnTransformer, more precisely in the definition of one of the preprocessors.

Looking closely at your pipeline, you wrote a line to define the numerical preprocessor in this manner:

(StandardScaler(), SimpleImputer(), numerical_columns)

Let me recall one of the instructions in the wrap-up so that you can spot the difference:

Be aware that you can pass a Pipeline as a transformer in a ColumnTransformer. We give a succinct example where we use a ColumnTransformer to select the numerical columns and process them (i.e. scale and impute). We additionally show that we can create a final model combining this preprocessor with a classifier.

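from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
scaler_imputer_transformer = make_pipeline(StandardScaler(), SimpleImputer())
preprocessor = ColumnTransformer(transformers=[
    ("num-preprocessor", scaler_imputer_transformer, numerical_features)
])
model = make_pipeline(preprocessor, LogisticRegression())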

You can see that to apply several transformers to the same columns, one needs to create a pipeline. You cannot pass the scaler and the imputer one after the other in the same tuple. You need to chain them with, for instance, make_pipeline(StandardScaler(), SimpleImputer()).
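Applied to a decision tree regressor, the fix could look like the sketch below. The data is synthetic, "LotArea" and the target values are hypothetical, and the handle_unknown settings are an assumption. Only the structure of the tuples matters: each one is (transformer, columns).

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 60
# Synthetic stand-in for the housing data; values are invented.
data = pd.DataFrame({
    "MSZoning": rng.choice(["RL", "RM", "FV"], size=n),
    "LotArea": rng.normal(10_000, 2_000, size=n),
})
data.loc[::7, "LotArea"] = np.nan  # a few missing values to impute
target = rng.normal(180_000, 40_000, size=n)

categorical_columns = ["MSZoning"]
numerical_columns = ["LotArea"]

# The scaler and the imputer are chained into ONE transformer.
numerical_preprocessor = make_pipeline(StandardScaler(), SimpleImputer())
preprocessor = make_column_transformer(
    (OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
     categorical_columns),
    (numerical_preprocessor, numerical_columns),
)
model = make_pipeline(preprocessor, DecisionTreeRegressor(random_state=0))

cv_results = cross_validate(model, data, target, cv=5, error_score="raise")
print(cv_results["test_score"])
```

With the tuples well formed, every fold returns a finite score. As noted earlier in the thread, the scaler could also be dropped for a tree-based model, leaving only the imputer for the numerical columns.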

Hope it helps.
