How to interpret diagonal graph in pairplot display

Hello,
I am having difficulty interpreting the diagonal of a pairplot diagram.
Intuitively, I would expect a diagonal line (a plot of (x, f(x)) where f(x) := x).
Could you give any complementary resource explaining the construction of such a diagram?

Thanks.

3 Likes

Hello @Thierry71,

You are right, plotting a variable against itself would produce a single diagonal line. Since that is not really informative, Seaborn replaces those plots with a distribution plot of each variable.

Quoting from the Seaborn documentation:

The diagonal plots are treated differently: a univariate distribution plot is drawn to show the marginal distribution of the data in each column.

1 Like

Thanks for your answer.
I will try to understand the “marginal distribution” used in the “univariate distribution” :slight_smile:
The vocabulary of statistics is somewhat unfamiliar to me :-/

1 Like

Got it (but in French):

and
https://wikis.hu-berlin.de/mmint/Basics:_Marginal_and_Conditional_Distributions/fr

1 Like

It corresponds to the distribution of each individual feature (column). For instance, for the age feature, we plot a histogram: the number of people in each age bin, over all possible ages. In other words, how many people are between 20 and 21, between 21 and 22, etc. A distribution normalizes these counts by the total number of people.
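As a small worked example of counting and then normalizing, here is a sketch with NumPy. The ages and the one-year bins are invented for illustration.

```python
import numpy as np

# Made-up ages for eight people
ages = np.array([20, 20, 21, 22, 22, 22, 23, 24])

# One-year bins: [20, 21), [21, 22), [22, 23), [23, 24), [24, 25]
bins = np.array([20, 21, 22, 23, 24, 25])

# Histogram: number of people falling in each bin
counts, _ = np.histogram(ages, bins=bins)
print(counts)  # [2 1 3 1 1]

# Distribution: the same counts divided by the total number of people
proportions = counts / counts.sum()
print(proportions)  # [0.25  0.125 0.375 0.125 0.125]
```

The shape of the bars is the same in both cases; only the scale of the vertical axis changes.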

In the tutorial, the distributions are computed separately for the two groups, low-income and high-income. This explains why there is a blue histogram and an orange histogram.
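Here is a sketch of what "computed separately for the two groups" means, again with invented data (the `ages` and `income` arrays are not the tutorial's dataset). Each group gets its own histogram over the same bins. In seaborn, passing a column name to the `hue` parameter of `pairplot` is what splits the plots by group and colors them.

```python
import numpy as np

# Made-up data: one age and one income label per person
ages = np.array([22, 25, 31, 38, 45, 52, 29, 41, 47, 58])
income = np.array(["low", "low", "low", "high", "high",
                   "high", "low", "low", "high", "high"])

# Shared ten-year bins so the two histograms are comparable
bins = np.array([20, 30, 40, 50, 60])

# One histogram per group, computed only from that group's rows
hist_by_group = {
    group: np.histogram(ages[income == group], bins=bins)[0]
    for group in ("low", "high")
}
print(hist_by_group["low"])   # [3 1 1 0]
print(hist_by_group["high"])  # [0 1 2 2]
```

Adding the two per-group histograms together gives back the histogram of the whole column.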

2 Likes

The histograms were a surprise to me, too! (@Thierry71). Thank you for clarifying the Seaborn way.

1 Like