Unable to download the CSV file of the dataset on OpenML

Dear all,

I started the training this morning and I am unable to download the dataset as indicated on OpenML. The link is not working. Could the file be made available here?

You don’t need to download it from OpenML: it suffices to run the first cell of the notebook to load it on FUN. Or are you trying to run the notebook locally on your computer?

It’s available at the following link:
http://www.openml.org/d/1590

The default format is .arff. I had no idea how to handle such a file, so I used an online CSV converter and it worked.

Hope this helps.

You can use fetch_openml from scikit-learn; see the sklearn.datasets.fetch_openml page of the scikit-learn 1.0.2 documentation.

You can load any dataset from OpenML directly into a pandas DataFrame or a NumPy array.
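
As a minimal sketch of this approach, assuming the dataset is the one with ID 1590 from the OpenML link earlier in the thread, and that scikit-learn and pandas are installed. The helper name `load_openml_frame` is made up for illustration:

```python
def load_openml_frame(data_id: int = 1590):
    """Return an OpenML dataset as a single pandas DataFrame.

    The default ID 1590 is taken from the OpenML URL quoted in this thread.
    """
    # Imported inside the function so that merely defining the helper
    # does not require scikit-learn to be installed.
    from sklearn.datasets import fetch_openml

    # as_frame=True asks for pandas objects instead of NumPy arrays.
    bunch = fetch_openml(data_id=data_id, as_frame=True)

    # bunch.frame holds the feature columns and the target column together.
    return bunch.frame
```

Calling `load_openml_frame().to_csv("openml_1590.csv", index=False)` should then write a local CSV copy (the output file name is arbitrary). The first call needs network access, since the data is fetched from OpenML.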

I had the same problem (I prefer to avoid the notebook when I can). I was able to convert the file with the script given here: GitHub - ayushkurlekar/arff_to_csv_converter: Convert arff files to CSV files in Python
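
For simple files, the conversion can also be sketched with the standard library alone. This is not the linked script, just an illustration under stated assumptions: the ARFF file is dense (not sparse), attribute names contain no spaces, and values are either unquoted or single-quoted. The function name `arff_to_csv` is hypothetical.

```python
import csv
import io


def arff_to_csv(arff_text: str) -> str:
    """Convert the text of a simple, dense ARFF file to CSV text."""
    column_names = []
    data_lines = []
    in_data_section = False

    for raw_line in arff_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and ARFF comments, which start with '%'.
        if not line or line.startswith("%"):
            continue
        if in_data_section:
            data_lines.append(line)
        elif line.lower().startswith("@attribute"):
            # "@attribute <name> <type>": keep only the name.
            name = line.split(None, 2)[1]
            column_names.append(name.strip("'\""))
        elif line.lower().startswith("@data"):
            in_data_section = True

    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(column_names)
    # ARFF values may be single-quoted and may follow a space after the comma.
    for row in csv.reader(data_lines, quotechar="'", skipinitialspace=True):
        writer.writerow(row)
    return output.getvalue()
```

For real datasets, a dedicated ARFF reader is likely more robust than this sketch, which ignores attribute types and missing-value markers.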

Does that mean that we cannot just access it from a script running locally?
When I did that I got the following error:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 25, saw 365
I used this command in my script:
ames_housing = pd.read_csv("scikit-learn-mooc/ames_housing_no_missing.csv at main · INRIA/scikit-learn-mooc · GitHub")

Where am I going wrong?
Thanks

You should use the “raw” version. Otherwise, what you get is GitHub’s HTML rendering of the page, not the CSV file itself.
Use the following link:

https://raw.githubusercontent.com/INRIA/scikit-learn-mooc/main/datasets/ames_housing_no_missing.csv
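
To illustrate the difference between the two kinds of link, here is a small sketch that rewrites a GitHub page URL into its raw counterpart. It assumes the usual `github.com/<owner>/<repo>/blob/<branch>/<path>` layout and a branch name without slashes. The function name `github_blob_to_raw` is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Turn a github.com "blob" page URL into a raw.githubusercontent.com URL."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")

    # Expected path layout: owner / repo / "blob" / branch / file path...
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("expected a /<owner>/<repo>/blob/<branch>/<path> URL")

    # Drop the "blob" segment; everything else is kept in order.
    owner, repo, _, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))
```

With pandas installed, `pd.read_csv(github_blob_to_raw(page_url))` should then parse the actual CSV rather than the HTML page.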

Thanks, that solved the problem.
I used this link because I initially wanted to use the file locally, but whatever I do, I get this error message when I copy and paste the content into a local CSV file:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
How do I download the file to use it properly?
Thanks

You can open the URL in your browser and use “Save as” to save the CSV file.
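
The download can also be done from a script. The sketch below uses only the standard library. The helper name `download_file` and the paths are hypothetical. It also assumes, without being able to confirm it from the message alone, that the `unicodeescape` error came from a Windows path written in a normal string literal, where `\U` is read as the start of a Unicode escape.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Raw URL quoted earlier in this thread.
RAW_URL = (
    "https://raw.githubusercontent.com/INRIA/scikit-learn-mooc/"
    "main/datasets/ames_housing_no_missing.csv"
)


def download_file(url: str, destination: Path) -> Path:
    """Save the bytes served at `url` to `destination` and return that path."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(destination))
    return destination


# A Windows path in a normal string literal can trigger the error above,
# because the backslash followed by "U" starts a Unicode escape.
# A raw string (r"...") keeps the backslashes as they are.
# The path itself is a made-up example.
windows_style = r"C:\Users\me\ames_housing_no_missing.csv"
```

A call such as `download_file(RAW_URL, Path("datasets") / "ames_housing_no_missing.csv")` would save the file, and the returned path can be passed to `pd.read_csv`. Forward slashes or `pathlib.Path` avoid the escape problem as well.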