Hello,
this course is very challenging and interesting, thank you.
I think I understand the concepts and methods, but I have some difficulty fully grasping step 4:
numerical_columns = [
"age", "education-num", "capital-gain", "capital-loss",
"hours-per-week"]
This is ok, I understand the purpose.
categorical_columns = [
"workclass", "education", "marital-status", "occupation",
"relationship", "race", "sex", "native-country"]
Same logic here, and I understand it equally well.
all_columns = numerical_columns + categorical_columns + [target_column]
OK, we have three lists of columns, one of which contains a single column.
adult_census = adult_census[all_columns]
That’s where I’m stuck: it seems so redundant to me!
Is this necessary as part of the data preparation, i.e. do the columns have to be grouped by usage before the data is passed to the functions? Are these variable names mandatory?
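To make my question concrete, here is a minimal sketch of what I believe that last line does. It uses a made-up toy DataFrame rather than the real dataset, and the names toy, num_cols, cat_cols and target are my own assumptions, not names from the course. If I read it correctly, the line only keeps the listed columns (in the listed order), and the names of the lists themselves do not matter:

```python
import pandas as pd

# Toy stand-in for adult_census (made-up values, plus one extra column).
toy = pd.DataFrame({
    "fnlwgt": [77516, 83311],
    "age": [39, 50],
    "sex": ["Male", "Female"],
    "class": ["<=50K", ">50K"],
})

# Hypothetical list names: only the column labels inside them matter.
num_cols = ["age"]
cat_cols = ["sex"]
target = "class"

# Indexing with a list of labels keeps only those columns, in that order.
subset = toy[num_cols + cat_cols + [target]]
print(list(subset.columns))  # ['age', 'sex', 'class']
```

So in this toy case the "fnlwgt" column is dropped simply because it appears in none of the lists. Is that the only purpose of the step?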
Thank you for your help.