Use of numpy

I have a problem using numpy: Python does not accept the argument diag_kws={'bins': 30}.
The TypeError is: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
And yet I have checked that I am using the latest version of numpy.
Have any of you had this issue before, and how did you resolve it?
Thanks for your help.

Are you running the notebooks locally on your computer? If so, try updating pandas as well.

First, thank you for your answer.
Yes, I am running the notebooks locally on my computer.
I ran:
!pip3 install numpy --upgrade
and
!pip3 install pandas --upgrade
But unfortunately, I still get the same error message.

It seems that it tries to convert adult_census into float, which is not possible.

In my opinion, it cannot work because some features are strings that cannot be interpreted as numbers: these strings should all be converted into numerical values. For example, for sex, we could assign the value 0 to male and 1 to female.
What do you think?
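
For illustration only, the encoding proposed above could be sketched as follows. The toy DataFrame and its labels are hypothetical stand-ins, not the real dataset. Note also that the pairplot call shown later in this thread restricts plotting to three numeric columns through vars, so this encoding may well be unrelated to the error.

```python
import pandas as pd

# Hypothetical toy data; the real dataset's labels may be spelled differently.
toy = pd.DataFrame({"sex": ["Male", "Female", "Female"], "age": [39, 50, 38]})

# Map each category to an integer code, as proposed in the post above.
sex_codes = {"Male": 0, "Female": 1}
toy["sex_encoded"] = toy["sex"].map(sex_codes)

print(toy["sex_encoded"].tolist())
```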

Could you provide the code you are trying to execute, so that we can try to reproduce the error and give you a proper explanation?

import seaborn as sns

# We will plot a subset of the data to keep the plot readable and make the
# plotting faster
n_samples_to_plot = 5000
columns = ["age", "education-num", "hours-per-week"]
_ = sns.pairplot(
    data=adult_census[:n_samples_to_plot],
    vars=columns,
    hue=target_column,
    plot_kws={"alpha": 0.2},
    height=3,
    diag_kind="hist",
    diag_kws={"bins": 30},
)
TypeError                                 Traceback (most recent call last)
<ipython-input-24-43073718075a> in <module>
     12     height=3,
     13     diag_kind="hist"
---> 14     ,diag_kws={'bins': 30},
     15 )
...........
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type

I have another explanation. When I write:
print(f"The dataset contains {adult_census.shape[0]} samples and {adult_census.shape[1]} columns")
The result is: The dataset contains 48842 samples and 15 columns
While for you, the result is: The dataset contains 48842 samples and 14 columns
Where does this difference come from?

Please post a minimal Python code snippet (at most 10 lines, including all import statements) that reproduces the problem on your machine, and check that the same snippet does not raise the error in the sandbox notebook of the FUN platform.

Please format the code snippet in the forum using the triple backticks markers to wrap the code block (or use the ctrl-E keyboard shortcut).

Then compare the numpy and pandas versions on both machines:

import pandas, numpy

print("pandas", pandas.__version__)
print("numpy", numpy.__version__)

In addition, I have a column named "fnlwgt" when I run the Python code:
import pandas as pd
adult_census = pd.read_csv("C:/Users/thier/Downloads/phpMawTba.csv")
import numpy as np
adult_census.columns
Maybe the phpMaw file I loaded is not the right one?

Indeed, we removed this column for the sake of simplicity and to improve the educational value of this notebook early in the MOOC.

Please tell me how to easily remove this column from my phpMaw file.

You can use:

adult_census = adult_census.drop("fnlwgt", axis="columns")
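
As a small self-contained sketch of what this call does, here is the same drop applied to a toy DataFrame. The frame and its values are made up for illustration and only stand in for the real dataset.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are hypothetical.
toy = pd.DataFrame(
    {
        "age": [25, 38],
        "fnlwgt": [1000, 2000],
        "hours-per-week": [40, 50],
    }
)

# drop returns a new DataFrame; the original is left unchanged
# unless the result is assigned back to the same name.
trimmed = toy.drop("fnlwgt", axis="columns")

print(toy.shape)
print(trimmed.shape)
print(list(trimmed.columns))
```

Assigning the result back to the same name, as in the line above, is what makes the change persist in the rest of the notebook.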

Please refer to a pandas tutorial for more info on data loading and transformation, for instance, starting at:

https://pandas.pydata.org/docs/user_guide/10min.html

I have pandas version 1.0.1, while you work with 1.4.2.
When I run: !pip3 install pandas --upgrade
I am not sure it works.
How can I install the most recent release of pandas?

Do you get any output when doing so?

I get the following output:

Requirement already satisfied: pandas in c:\users\thier\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (1.4.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\thier\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from pandas) (2022.1)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\users\thier\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from pandas) (2.8.2)
Requirement already satisfied: numpy>=1.18.5 in c:\users\thier\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from pandas) (1.22.3)
Requirement already satisfied: six>=1.5 in c:\users\thier\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from python-dateutil>=2.8.1->pandas) (1.15.0)
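
One possible explanation, not confirmed in this thread, is that the notebook kernel was still using a pandas loaded before the upgrade, or that it runs a different Python environment from the one pip3 wrote to. The sketch below shows one way to check which interpreter the kernel uses and which versions are installed for it. The helper names installed_version and pip_upgrade_command are hypothetical, introduced only for this example.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the version recorded on disk for this interpreter, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pip_upgrade_command(package):
    """Build a pip command bound to the interpreter that runs this code."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


# Path of the Python interpreter behind the current kernel or script.
print("interpreter:", sys.executable)

for name in ("pandas", "numpy"):
    print(name, installed_version(name))

# Command that upgrades pandas for this exact interpreter (printed, not run).
print(" ".join(pip_upgrade_command("pandas")))
```

If the version printed here is newer than pandas.__version__ in the same session, the module was probably imported before the upgrade, and restarting the kernel should pick up the new release.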

Now it works perfectly without changing anything!

Well, we are glad to hear that!