@LearnerJoe, mate, I figured out why we got those answers! In our column transformer, the numerical features are processed before the categorical features, so the order of feature_names is different.
The code snippet in the quiz assumes that the categorical processor comes first:
feature_names += numerical_columns
The workaround for our order of preprocessing is:
feature_names = numerical_columns + feature_names
I looked up the weights for each pair using:
print("Weights of pair 1, hours-per-week & native-country_ Columbia:",
f"{coefs_clf_df2['hours-per-week'].mean():0.2f} and {coefs_clf_df2['native-country_ Columbia'].mean():0.2f}")
print("Weights of pair 2, workclass_ ? & native-country_ ?:",
f"{coefs_clf_df2['workclass_ ?'].mean():0.2f} and {coefs_clf_df2['native-country_ ?'].mean():0.2f}")
print("Weights of pair 3, capital-gain & education_ Doctorate:",
f"{coefs_clf_df2['capital-gain'].mean():0.2f} and {coefs_clf_df2['education_ Doctorate'].mean():0.2f}")
This time the results lined up with the box plot visualisation.
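In case it helps, the three print calls above can be folded into a loop. This is only a sketch: the coefs frame below is a hypothetical stand-in for coefs_clf_df2 (I'm assuming one row per cross-validation fold and one column per feature), the numbers are invented, and mean_weights is a helper name I made up:

```python
import pandas as pd

# Hypothetical stand-in for coefs_clf_df2; the values are invented.
coefs = pd.DataFrame({
    "hours-per-week": [0.30, 0.32],
    "native-country_ Columbia": [-0.10, -0.14],
    "workclass_ ?": [-0.20, -0.22],
    "native-country_ ?": [-0.05, -0.07],
    "capital-gain": [2.30, 2.34],
    "education_ Doctorate": [1.10, 1.14],
})

pairs = [
    ("hours-per-week", "native-country_ Columbia"),
    ("workclass_ ?", "native-country_ ?"),
    ("capital-gain", "education_ Doctorate"),
]


def mean_weights(coefs, pair):
    """Return the mean coefficient of each feature in the pair."""
    return tuple(coefs[name].mean() for name in pair)


for i, pair in enumerate(pairs, start=1):
    first, second = mean_weights(coefs, pair)
    print(f"Weights of pair {i}, {pair[0]} & {pair[1]}: "
          f"{first:0.2f} and {second:0.2f}")
```

Building the label from the same strings used as column keys also avoids the label and the key drifting apart.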