Proportional Area Plots

Visual representation of relationships between categorical datasets

Graphical representations that compare categorical variables are not very common. They are not conceptually difficult, but few tools offer a ready way to generate suitable figures (scatter plots, bar charts, and pie charts are much easier to find). Tools like GGally in R, however, make more complex graphs accessible, and that package provides a figure style consisting of a grid of rectangles with proportional areas.

Figure 1 shows the proportion of Titanic passengers who survived, by sex. Larger rectangles indicate larger absolute counts (here, area is proportional to count).

Figure 1: Proportional area comparison of Titanic survivors by sex.
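To make "area is proportional to count" concrete, here is a minimal sketch of one way to size the rectangles: draw each cell as a square whose side scales with the square root of its count. The counts are taken from R's built-in `Titanic` table and are assumed (not confirmed) to match the data behind Figure 1. The helper name `square_sides` is hypothetical, and the plotting package may compute its sizes differently.

```python
from math import sqrt

# Sex-by-survival counts from R's built-in Titanic table
# (assumption: the figure may be based on a different Titanic dataset).
counts = {
    ("Male", "No"): 1364,
    ("Male", "Yes"): 367,
    ("Female", "No"): 126,
    ("Female", "Yes"): 344,
}


def square_sides(counts, max_side=1.0):
    """Return a side length per cell so that side**2 is proportional to count.

    The largest cell gets side length max_side; every other cell is scaled
    by the square root of its count relative to the largest count.
    """
    largest = max(counts.values())
    return {cell: max_side * sqrt(n / largest) for cell, n in counts.items()}


sides = square_sides(counts)
for cell, side in sides.items():
    # Print the side length and the resulting area for each cell.
    print(cell, round(side, 3), round(side ** 2, 3))
```

Scaling the side by the square root, rather than by the count itself, is what keeps the area (not the width) proportional to the count.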
Note: This representation is an intuitive graphical version of the comparison of cell counts that underlies a chi-square test of independence.
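The link to the chi-square test can be sketched numerically: the test compares each observed cell count (the rectangle areas) with the count expected if sex and survival were independent. The example below is an illustration only, using counts from R's built-in `Titanic` table, which are assumed rather than confirmed to match the figure. It computes the Pearson statistic by hand, without a continuity correction.

```python
# Observed sex-by-survival counts from R's built-in Titanic table
# (assumption: the figure may be based on a different Titanic dataset).
observed = {
    ("Male", "No"): 1364,
    ("Male", "Yes"): 367,
    ("Female", "No"): 126,
    ("Female", "Yes"): 344,
}

rows = ["Male", "Female"]
cols = ["No", "Yes"]

# Marginal totals and the grand total.
row_total = {r: sum(observed[(r, c)] for c in cols) for r in rows}
col_total = {c: sum(observed[(r, c)] for r in rows) for c in cols}
n = sum(observed.values())

# Expected count under independence: (row total * column total) / n.
expected = {(r, c): row_total[r] * col_total[c] / n for r in rows for c in cols}

# Pearson chi-square statistic (no continuity correction).
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

for k in observed:
    print(k, "observed:", observed[k], "expected:", round(expected[k], 1))
print("chi-square statistic:", round(chi2, 2))
```

Cells whose rectangles look much larger or smaller than the margins would suggest are the ones contributing most to the statistic.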

One advantage of this representation is that we can use the stacked bar representation (see above) to subdivide each rectangle by a further categorical variable. In Figure 2, the blocks representing passengers, conditioned on sex and survival, are each divided according to class.

Figure 2: Proportional area comparison of Titanic survivors by sex, with stacked bar representation of class.
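The subdivision in Figure 2 amounts to computing, within each sex-by-survival block, the share of passengers in each class; those shares set the lengths of the stacked segments. A minimal sketch follows, using counts from R's built-in `Titanic` table (summed over age groups), which are assumed rather than confirmed to match the figure. The helper name `stacked_shares` is hypothetical.

```python
# Class counts within each sex-by-survival block, from R's built-in
# Titanic table summed over age (assumption: the figure's data may differ).
by_class = {
    ("Male", "No"): {"1st": 118, "2nd": 154, "3rd": 422, "Crew": 670},
    ("Male", "Yes"): {"1st": 62, "2nd": 25, "3rd": 88, "Crew": 192},
    ("Female", "No"): {"1st": 4, "2nd": 13, "3rd": 106, "Crew": 3},
    ("Female", "Yes"): {"1st": 141, "2nd": 93, "3rd": 90, "Crew": 20},
}


def stacked_shares(by_class):
    """Return, for each block, the fraction of its count in each class."""
    shares = {}
    for cell, class_counts in by_class.items():
        total = sum(class_counts.values())
        shares[cell] = {cls: n / total for cls, n in class_counts.items()}
    return shares


shares = stacked_shares(by_class)
for cell, cell_shares in shares.items():
    # Each block's shares sum to 1, so they fill the whole rectangle.
    print(cell, {cls: round(v, 3) for cls, v in cell_shares.items()})
```

The block totals still determine each rectangle's area, while the shares only describe how that area is split among classes.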