A few comments:

- Unless `"fnlwgt"` is used later in the MOOC, I suggest just dropping it at the beginning of each course. The first course already explains why we get rid of it, so I don’t think it is necessary to discuss it every time.
- It would be interesting to show what `train_test_split` has done in practice (the number of samples in each dataset, for instance).
- In general, I’m wondering whether using “created” when you do `model = LogisticRegression()` is misleading. It could imply (for a person not familiar with Python) that the model is ready to generate predictions. Maybe using “initiated” instead would be less confusing, or maybe “create” is the standard way of describing this action.
- You should define what “cross-validation” is in the “Caution!” inset.
- At the end, I would have liked to see something that visualizes the rule learned by the model rather than just the score.
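
To make the points about `train_test_split`, “created” versus fitted, and inspecting the learned rule more concrete, here is a rough sketch of what I have in mind. It is only an illustration: it uses synthetic data from `make_classification` rather than the course dataset, and the variable names (`data_train`, `predicted_before_fit`, and so on) are my own assumptions, not necessarily those of the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the course data (assumption: 1000 samples, 5 features).
data, target = make_classification(n_samples=1000, n_features=5, random_state=42)

# With the default settings, train_test_split holds out 25% of the samples.
data_train, data_test, target_train, target_test = train_test_split(
    data, target, random_state=42
)
n_train, n_test = data_train.shape[0], data_test.shape[0]
print(f"train: {n_train} samples, test: {n_test} samples")

# Creating the estimator only stores its hyperparameters; nothing is learned yet.
model = LogisticRegression()
try:
    model.predict(data_test)
    predicted_before_fit = True
except NotFittedError:
    # scikit-learn refuses to predict with an estimator that was never fitted.
    predicted_before_fit = False
print(f"could predict before fit: {predicted_before_fit}")

# Only after fit does the model hold a learned rule, here one weight per feature.
model.fit(data_train, target_train)
print("learned coefficients:", model.coef_)
print("test accuracy:", model.score(data_test, target_test))
```

Printing the split sizes and the coefficients is of course a very basic form of "showing what happened", but even that would help a beginner see that the model object is empty until `fit` is called.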