I too had difficulties with this exercise. I had not encountered numpy.ravel or numpy.reshape before, so I could not make use of either. I went through the code in the notebook and determined that there is an easy fix. If the first block of code supplied is:
import pandas as pd
penguins = pd.read_csv("../datasets/penguins_regression.csv")
feature_name = "Flipper Length (mm)"
target_name = "Body Mass (g)"
data, target = penguins[feature_name], penguins[target_name]
(Note the change to the last line!)
Then we can write a function for the goodness of fit measure very simply, for example:
def goodness_fit_measure(true_values, predictions):
    return sum((true_values - predictions)**2)
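To make the point concrete, here is a minimal sketch of why the change matters, assuming the change to the last line is the switch from double brackets (penguins[[feature_name]]) to single brackets (penguins[feature_name]). The four rows of data and the slope and intercept values are made up for illustration and are not taken from the penguins dataset or the notebook.

```python
import pandas as pd

# Made-up values standing in for the penguins CSV (assumption, not real data).
penguins = pd.DataFrame({
    "Flipper Length (mm)": [180.0, 190.0, 200.0, 210.0],
    "Body Mass (g)": [3500.0, 3800.0, 4300.0, 4600.0],
})
feature_name = "Flipper Length (mm)"
target_name = "Body Mass (g)"

# Single brackets give a 1-D Series; double brackets give a 2-D DataFrame.
data_1d = penguins[feature_name]
data_2d = penguins[[feature_name]]
target = penguins[target_name]
shape_1d = data_1d.shape
shape_2d = data_2d.shape


def goodness_fit_measure(true_values, predictions):
    # Sum of squared errors between observed and predicted values.
    return sum((true_values - predictions) ** 2)


# Hypothetical slope and intercept, chosen only for illustration.
weight, intercept = 40.0, -3700.0
predictions = weight * data_1d + intercept  # stays 1-D, same shape as target
error = goodness_fit_measure(target, predictions)
print(shape_1d, shape_2d, error)
```

Because data_1d and target are both 1-D Series, the subtraction lines up element by element and no reshaping is needed.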
Is the learning objective for this exercise to understand the parametrization of a linear model and determine how we may quantify the goodness of fit of the model, or is it about software carpentry? If the former, I think the exercise would benefit from doing away with the need to be familiar with numpy.ravel or numpy.reshape. True, we may be clients of a method or function that returns data in a form that is less than ideal for downstream processing, and being able to deal with such situations is a skill well worth acquiring, but I do not believe that was the intended focus of this exercise.