Q5/6_code for accuracy score

mahend58_sit · 16 April 2022 12:27

Hi, can anyone validate my learning and tell me whether I am doing this right?
Please teach me if I'm doing anything wrong in the following questions:

Q5:

Q6:

ArturoAmorQ · 19 April 2022 09:11

For Q5 it looks fine, just make sure your definition of num_col corresponds to the provided numerical_features, defined as

numerical_features = [
  "LotFrontage", "LotArea", "MasVnrArea", "BsmtFinSF1", "BsmtFinSF2",
  "BsmtUnfSF", "TotalBsmtSF", "1stFlrSF", "2ndFlrSF", "LowQualFinSF",
  "GrLivArea", "BedroomAbvGr", "KitchenAbvGr", "TotRmsAbvGrd", "Fireplaces",
  "GarageCars", "GarageArea", "WoodDeckSF", "OpenPorchSF", "EnclosedPorch",
  "3SsnPorch", "ScreenPorch", "PoolArea", "MiscVal",
]
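One quick way to check this is to compare the two lists as sets. This is only a sketch: the num_col below is a hypothetical stand-in (I deliberately left out "MiscVal" so the check has something to report), so replace it with your own definition.

```python
numerical_features = [
    "LotFrontage", "LotArea", "MasVnrArea", "BsmtFinSF1", "BsmtFinSF2",
    "BsmtUnfSF", "TotalBsmtSF", "1stFlrSF", "2ndFlrSF", "LowQualFinSF",
    "GrLivArea", "BedroomAbvGr", "KitchenAbvGr", "TotRmsAbvGrd", "Fireplaces",
    "GarageCars", "GarageArea", "WoodDeckSF", "OpenPorchSF", "EnclosedPorch",
    "3SsnPorch", "ScreenPorch", "PoolArea", "MiscVal",
]

# Hypothetical stand-in for your own num_col: the provided list without
# its last entry ("MiscVal"), just to illustrate a mismatch.
num_col = numerical_features[:-1]

# Columns expected by the exercise but absent from num_col.
missing = sorted(set(numerical_features) - set(num_col))
# Columns present in num_col but not part of the provided list.
extra = sorted(set(num_col) - set(numerical_features))

print("missing from num_col:", missing)
print("unexpected in num_col:", extra)
```

If both lists come out empty, your num_col selects the same columns as numerical_features.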

For Q6 you are no longer asked for the mean test score. Instead, you are asked to compare the cross-validation test scores of both models fold-to-fold, i.e. to count the number of folds where one model has a better test score than the other.

When you compare models A and B fold-to-fold, you could have for instance that the score in the first fold is better for model A

cv_results_A["test_score"][0] > cv_results_B["test_score"][0]

but have that

cv_results_A["test_score"][i] < cv_results_B["test_score"][i]

for i an integer between 1 and 9. This accounts for all 10 folds, as you were asked to set cv=10. This would read something like

“The model A is performing better than the model B in 1 out of 10 folds.”
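A minimal sketch of that fold-to-fold count could look like the following. The scores are made-up numbers standing in for the "test_score" entries returned by cross_validate with cv=10, so they are an assumption for illustration and not the results of the exercise.

```python
# Hypothetical per-fold test scores (made up for illustration); in the
# exercise these would come from cross_validate(..., cv=10) for each model.
cv_results_A = {
    "test_score": [0.91, 0.84, 0.80, 0.79, 0.85, 0.83, 0.78, 0.81, 0.80, 0.82]
}
cv_results_B = {
    "test_score": [0.88, 0.86, 0.83, 0.81, 0.87, 0.85, 0.80, 0.84, 0.82, 0.85]
}

scores_A = cv_results_A["test_score"]
scores_B = cv_results_B["test_score"]

n_folds = len(scores_A)
# Compare the two models fold by fold and count the folds where A wins.
n_folds_A_better = sum(a > b for a, b in zip(scores_A, scores_B))

print(
    f"The model A is performing better than the model B "
    f"in {n_folds_A_better} out of {n_folds} folds."
)
```

With the real output of cross_validate the scores are NumPy arrays, so (cv_results_A["test_score"] > cv_results_B["test_score"]).sum() gives the same count.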

mahend58_sit · 22 April 2022 11:00

got it, thank you so much @ArturoAmorQ