Inquiry about coefficient

Alvin19 · 28 June 2021 02:36

Hi, in Lecture: Linear model for classification,

The coefficient:

logistic_regression[-1].coef_
array([[10.57295179, -4.39030324]])

logistic_regression[-1].coef_[0]
array([10.57295179, -4.39030324])

However, when I check the number of dimensions:

np.ndim(logistic_regression[-1].coef_[0])
1
np.ndim(logistic_regression[-1].coef_)
2

To find the number of dimensions, we can count the nested square brackets in the printed coefficient array. Based on this, the reported dimensions are correct.

When calling logistic_regression[-1].coef_, does it mean we are displaying all the coefficient values?
Does calling logistic_regression[-1].coef_[0] give the coefficient for variable 1, and logistic_regression[-1].coef_[1] the coefficient for variable 2?

However, calling logistic_regression[-1].coef_[1] displays the message "IndexError: index 1 is out of bounds for axis 0 with size 1"

culmen_columns = ["Culmen Length (mm)", "Culmen Depth (mm)"]

I think this is because, when we construct culmen_columns (the columns for the x variables), we put the 2 variables in a list. Therefore, calling .coef_ or .coef_[0] displays the 2 coefficients together, and .coef_[1] (supposedly for variable 2) is not available.

Is my understanding correct? Is this normal behaviour, so that the usual indexing rule, where [0], [1] and [2] refer to the 1st, 2nd and 3rd elements of a list, does not have to hold?

glemaitre58 · 28 June 2021 06:50

You can have a look at the documentation:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)

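The 2-D array with a single row is only there for consistency with the multiclass case.
When you write coef_[1], you try to access the second row, which does not exist because there is only a single row. coef_[0] is that single row and holds the coefficients of both features, so the coefficient of the second feature is coef_[0][1] (or coef_[0, 1]).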
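
To make the indexing concrete, here is a minimal sketch. It does not fit a model: it assumes a plain NumPy array (named coef here, a hypothetical stand-in) with the same values and shape as the logistic_regression[-1].coef_ printed above, and the variable names for the two coefficients are only illustrative.

```python
import numpy as np

# Stand-in for logistic_regression[-1].coef_, using the values printed in the
# question. Shape is (1, n_features): one row, two feature columns.
coef = np.array([[10.57295179, -4.39030324]])

print(coef.shape)  # (1, 2)

# Axis 0 indexes rows, and there is only one row, so coef[0] returns
# both feature coefficients at once.
only_row = coef[0]

# Individual coefficients live along axis 1 (the columns).
culmen_length_coef = coef[0, 0]
culmen_depth_coef = coef[0, 1]
print(culmen_length_coef, culmen_depth_coef)

# Asking for a second row fails, because axis 0 has size 1.
try:
    coef[1]
except IndexError as err:
    message = str(err)
    print(message)
```

So the usual indexing rule still holds; it just applies to rows first. In a multiclass problem coef_ would have one row per class, and coef_[1] would then be valid.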