Question about education-num column

Hello,

In the “Working with numerical data” notebook, it is stated that:

However, the column "education-num" is different.

But I can’t see why if I only look at this column’s values.
We’re asked to execute:
data["education-num"].value_counts().sort_index()
to see its specificity, but
data["hours-per-week"].value_counts().sort_index()
gives quite similar output.

Aren’t we setting “education-num” aside because of our understanding of its meaning, rather than because of anything we can learn from studying its values?

I haven’t read the next notebook yet; maybe the answer lies in it.

Thank you.

Indeed, we don’t want to focus on this feature because it requires some explanation that will be given in the next notebook.

I think that we state:

This feature is indeed a nominal categorical feature. We exclude it from our analysis since particular attention is required when dealing with categorical features. This topic will be discussed in depth in the subsequent notebook.

Are you expecting another message to clarify something?

Hello @glemaitre,

Thank you for your answer.

I had inferred that we were meant to see that it is a categorical feature just by looking at the values, since we’re told: “to see this specificity”.
Regarding the later notebook about categorical data (it isn’t the next one, by the way), we’re told that: "In a previous notebook, we saw it is the case with the column "education-num"".
But I don’t think we’ve seen why this feature is categorical.

Maybe that’s only due to my lack of understanding of either the notion or English. If that’s the case, please don’t pay too much attention to my comment.

We removed this part about education-num. To simplify, we now say in the data exploration notebook that education and education-num contain the same information, so we can drop education-num later, and we never mention education-num again.
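For anyone who wants to check the "same information" claim themselves, here is a minimal sketch of one way to do it with a cross-tabulation. The small DataFrame below is a made-up stand-in for the course dataset (an assumption for illustration only), and the variable names `table` and `one_to_one` are hypothetical; in the notebook you would run the same check on the real `data`.

```python
import pandas as pd

# Toy stand-in for the census data; these rows are invented for illustration.
data = pd.DataFrame(
    {
        "education": ["HS-grad", "Bachelors", "HS-grad", "Masters", "Bachelors", "Masters"],
        "education-num": [9, 13, 9, 14, 13, 14],
    }
)

# Cross-tabulate the two columns: rows are the labels, columns are the codes.
table = pd.crosstab(index=data["education"], columns=data["education-num"])
print(table)

# If every row and every column has exactly one non-zero cell,
# each label maps to exactly one code and vice versa.
nonzero = table > 0
one_to_one = bool((nonzero.sum(axis=0) == 1).all() and (nonzero.sum(axis=1) == 1).all())
print(one_to_one)
```

This also speaks to the original question: `value_counts()` on a single column cannot reveal that the integers are codes for categories, whereas comparing the column against `education` can.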