Figure 1: iris dataset, with jittered scatterplot.
1 Combining Plots
No single visual representation is perfect, and it is often useful to combine representations to pool their advantages or offset their disadvantages, as in Figure 1, Figure 2, and Figure 3.
Tools like R and ggplot make these complex figures straightforward to generate. Tools like Excel do not.
Figure 2: iris dataset, with jittered scatterplot.
Figure 3: iris dataset, with jittered scatterplot.
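As a rough illustration of how such a combined figure might be built, the sketch below overlays jittered raw observations on a box plot using ggplot2. The choice of a box plot as the summary layer, the Sepal.Length variable, and the jitter width are assumptions made for this example; they are not necessarily the settings used for the figures above.

```r
library(ggplot2)

# Box plot summarising each species, with every raw observation overlaid
# as a jittered point so the underlying distribution stays visible.
# Assumption: Sepal.Length by Species is used purely for illustration.
p_combined <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  # Suppress the box plot's own outlier markers, since the jitter layer
  # already draws every observation.
  geom_boxplot(outlier.shape = NA) +
  # Jitter horizontally only, so the measured values are not distorted.
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  labs(x = "Species", y = "Sepal length (cm)")

print(p_combined)
```

Swapping geom_boxplot() for another summary layer, such as geom_violin(), should give a different combination while the jitter layer stays the same.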
1.1 Bar Chart
For comparison, we present a representation of this kind of data that is common in the literature but less informative: a bar chart with error bars showing the standard deviation of each dataset.
Figure 4: iris dataset, with error bars representing standard deviation.
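A bar chart of this kind could be produced along the following lines. This is a minimal sketch: the summary data frame `iris_summary`, its column names, and the use of Sepal.Length are assumptions for illustration, not a reconstruction of the figure above.

```r
library(ggplot2)

# Per-species mean and standard deviation of sepal length.
# Assumption: iris_summary, mean_len, and sd_len are illustrative names.
means <- tapply(iris$Sepal.Length, iris$Species, mean)
sds   <- tapply(iris$Sepal.Length, iris$Species, sd)

iris_summary <- data.frame(
  Species  = factor(names(means), levels = levels(iris$Species)),
  mean_len = as.numeric(means),
  sd_len   = as.numeric(sds)
)

# Bars show the mean; error bars extend one standard deviation either side.
p_bar <- ggplot(iris_summary, aes(x = Species, y = mean_len)) +
  geom_col(fill = "grey70") +
  geom_errorbar(aes(ymin = mean_len - sd_len, ymax = mean_len + sd_len),
                width = 0.2) +
  labs(x = "Species", y = "Mean sepal length (cm)")

print(p_bar)
```

Note that the plot is drawn from three summary rows only: the individual observations never reach the figure, which is one reason this representation carries less information than the combined plots.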
- Which visualisations do you think made it easiest for you to interpret the data?
2 Pairs Plots
Tools like GGally and ggplot2 in R provide pairs plots that combine the best-practice versions of the above representations to give a quick overview of a dataset. The iris and titanic datasets are summarised below in Figure 5 and Figure 6.
Figure 5: iris data, providing an overview of relationships between variables.
Figure 6: titanic data, providing an overview of relationships between variables.
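For readers who want to try this themselves, a pairs plot of the iris data might be generated with GGally's ggpairs() function, as in the sketch below. Colouring by Species is an assumption made for this example, and the figure above may have been produced with different options. ggpairs() chooses each panel's plot type from the types of the two variables being compared.

```r
library(ggplot2)
library(GGally)

# Pairs plot of all five iris columns, one panel per pair of variables.
# Assumption: colouring by Species is an illustrative choice.
p_pairs <- ggpairs(iris, mapping = aes(colour = Species))

print(p_pairs)
```

The `columns` argument of ggpairs() can restrict the plot to a subset of variables, which may help when refining a pairs plot for a wider dataset.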
- Which data representations does the pairs plot use?
- Which visualisation is used for comparing each type of data?
- Do you agree that the right choice was made for each data type?
- How would you refine these pairs plots?