Q4 Confusing answer

I find the answer to Question 4 very confusing.

SPOILER FOR THE ANSWER

I don’t understand why OverallQual and OverallCond are considered ordinal categories and not included in the answer. To me the logic is the same as for the YearBuilt feature: it is an ordinal number, but it is related to the state of the house. The numbers in OverallQual and OverallCond are not arbitrary: a higher number means a better house, and IMO they should be treated as numerical values. I got this question wrong because of this, and I don’t think that is fair.


Opinion and quality scores are usually considered as ordinal categorical variables.

In general, treating such scores as numerical means assuming that arithmetic operations on them (beyond simply averaging) have a meaningful interpretation. Suppose, for example, that someone rates one movie as 5 and another as 10. It is difficult to imagine what it would mean to say that they liked one movie “twice as much” as the other. In the same way, you cannot think of one house as being “twice as well preserved” or “twice as deteriorated” as another.
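To make this concrete, here is a small sketch in plain Python. The quality labels and both integer mappings are made up for illustration (they are not the actual OverallQual scale). Both mappings respect the same order, so they rank the houses identically, yet the ratio between two scores depends entirely on which mapping was picked.

```python
# Two hypothetical order-preserving encodings of the same quality labels.
encoding_a = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}
encoding_b = {"poor": 1, "fair": 5, "good": 6, "excellent": 10}

# Hypothetical quality ratings for four houses.
houses = ["good", "poor", "excellent", "fair"]
codes_a = [encoding_a[h] for h in houses]
codes_b = [encoding_b[h] for h in houses]


def ranking(codes):
    """Return house indices sorted from lowest to highest score."""
    return sorted(range(len(codes)), key=lambda i: codes[i])


# The ordering of the houses is identical under both encodings...
same_order = ranking(codes_a) == ranking(codes_b)

# ...but "how many times better" changes with the encoding.
ratio_a = encoding_a["good"] / encoding_a["poor"]
ratio_b = encoding_b["good"] / encoding_b["poor"]

print(same_order)        # True
print(ratio_a, ratio_b)  # 3.0 6.0
```

Only the order carries information here, which is what makes the variable ordinal rather than numerical.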

Because of that, you can think of quality scores as already encoded with an OrdinalEncoder. In that sense, quality is similar to the example presented in the Encoding of categorical variables notebook where we give the example of a categorical variable named "size" with categories such as “S”, “M”, “L”, “XL” that we map to increasing integers such as 0, 1, 2, 3.
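As a minimal sketch of that mapping, assuming scikit-learn is installed and using a toy "size" column invented for this example, passing the categories explicitly makes `OrdinalEncoder` follow the intended order instead of the default sorted (here alphabetical) one:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Toy "size" column (made-up data), one feature with five samples.
sizes = np.array([["M"], ["S"], ["XL"], ["L"], ["S"]])

# Give the categories in their meaningful order: S < M < L < XL.
encoder = OrdinalEncoder(categories=[["S", "M", "L", "XL"]])
encoded = encoder.fit_transform(sizes)

print(encoded.ravel())  # [1. 0. 3. 2. 0.]
```

A quality score from 1 to 10 can be read the same way: the integers are the output of such an encoding, not measurements on a numerical scale.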
