A nitpick on one of the answers to Question 3. Specifically:
One-hot encoding will…encode a single string-encoded column into a single integer coded column
I guessed correctly that this is supposed to be false, but there are situations where it is true! Specifically, when the column has only two categories and drop="first" is specified, the encoder outputs a single 0/1 column.
A minimal example:
from sklearn.preprocessing import OneHotEncoder
import pandas as pd

# A single string column with exactly two categories
df = pd.DataFrame([
    {"binary_category": "yes"},
    {"binary_category": "no"},
    {"binary_category": "no"},
    {"binary_category": "yes"},
    {"binary_category": "yes"},
])
df

# sparse_output replaces the older "sparse" argument (renamed in scikit-learn 1.2)
ohe = OneHotEncoder(drop="first", sparse_output=False)
ohe.fit_transform(df)
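To make the difference visible, here is a small sketch comparing the default encoder with drop="first" on the same column. The variable names `full` and `dropped` are my own, and I call `.toarray()` on the default sparse result so the snippet should not depend on which scikit-learn version is installed.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({"binary_category": ["yes", "no", "no", "yes", "yes"]})

# Default behaviour: one output column per category ("no", "yes").
full = OneHotEncoder().fit_transform(df).toarray()

# drop="first" removes the column of the first category ("no"),
# leaving a single 0/1 column that flags "yes".
dropped = OneHotEncoder(drop="first").fit_transform(df).toarray()

print(full.shape)       # (5, 2)
print(dropped.shape)    # (5, 1)
print(dropped.ravel())  # [1. 0. 0. 1. 1.]
```

Note that the values come back as floats (0.0 and 1.0) by default, so "integer coded" is only loosely true even in this case.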
I think this question should be adjusted in the next version of the quiz.