Hello,
In the “First look at our dataset” notebook (Module 1, tabular data exploration section):
The rows represents a record
and
The columns represents a type of information collected
I think there is a mismatch between the plural noun and the singular verb form.
I suggest replacing “The rows” and “The columns” with “Each row” and “Each column”.
I would also have written “a type of collected information”.
Later, there is a missing “one” in
We can compute the number of features by counting the number of columns and subtract 1, since one of the column is the target.
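For reference, the computation that sentence describes can be sketched as below. This is only a minimal illustration on a toy table: the column names and values are made up, and I am assuming the last column plays the role of the target.

```python
import pandas as pd

# Toy table with hypothetical column names; "class" stands in for the target.
df = pd.DataFrame(
    {
        "age": [25, 38, 52],
        "hours-per-week": [40, 50, 35],
        "class": ["<=50K", ">50K", "<=50K"],
    }
)

# DataFrame.shape is a (number of rows, number of columns) tuple.
n_samples, n_columns = df.shape

# Subtract 1 because one of the columns is the target, not a feature.
n_features = n_columns - 1

print(f"{n_samples} samples, {n_features} features")
```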
Later, there is a missing ‘s’ in
In a machine-learning setting, an algorithm automatically creates the “rules” in order to make predictions on new data.
Next, I think the final ‘s’ on the verbs below is incorrect:
Values towards 0 (dark blue) indicates that the model predicts low-income with a high probability. Values towards 1 (dark orange) indicates that the model predicts high-income with a high probability. Values towards 0.5 (white) indicates that the model is not very sure about its prediction.
In the following quiz, in the explanation of the first question’s answer, there is a missing ‘s’:
The string given to the pd.read_csv function indicates the relative path where the physical CSV file is located.
In the second question’s answer, I would remove the “to”:
It can also compute simple summary statistics (counting unique values or computing the median or mean value of a column) or to do graphical visualizations (with the help of matplotlib under the hood).
Shouldn’t the third question be:
How is a tabular dataset organized?