Storytelling

The relationship between data visualisation and storytelling

1 Introduction

Data visualisation is rarely just a simple rendering of the data. Fundamentally, data visualisation is a method of communication, and it is worth thinking of it as a form of storytelling, similar to writing. As scientists, we may have specific goals for data exploration, such as highlighting patterns that may indicate correlation between variables: we tell ourselves a story from the data. When writing reports and papers, we want to share what we see in the data with our reader. We want to tell them a story in pictures, to share our mental model with them.

There are strong similarities to storytelling with words. For instance, there is a grammar to visual information (called gestalt principles), just as there is a grammar to written language. As humans we have evolved to use this grammar to recognise patterns in what we see, including visual images on the page, and interpret them as having meaning. We can use those natural, innate tendencies to recognise and interpret patterns to make understanding and interpretation of data almost effortless for the reader.

Figure 1: Pareidolia: the human brain incorrectly interprets images to have meaning even when there is no meaning present. This Martian mesa appears to have a face but, regardless of what any YouTube documentaries or bros in tinfoil hats might tell you, there is in fact no face there. Image by NASA (Public Domain)
Warning

It is also possible to use these natural ways that we interpret data to mislead people - accidentally or otherwise. We will see some examples of this.

If you know that it is possible to be misled by visual representation, you will be better able to spot bad graphics in papers. By the end of this page, you’ll also be better able to mislead people.

With great power comes great responsibility…

2 A scientific story

Here’s a story as it might appear in a presentation or a paper, told with a single figure:

2.1 A story: “AvrY induces chlorosis more strongly than AvrX”

Suppose you’re reading a manuscript, and the text makes this claim:

We measured chlorosis in 30 plants after the application of either AvrX or AvrY. Figure 2 demonstrates that induction of chlorosis was stronger for AvrY than for AvrX.

Figure 2: Bar chart showing amount of chlorosis in leaves after application of AvrX and AvrY
Questions
  1. How does Figure 2 support the statement that AvrY induces chlorosis more strongly than AvrX?
  2. What information is present in Figure 2?
  3. What information do you think you need that is not present in Figure 2?
  1. Which bar is larger? By how much?
  2. What is the extent of induction of chlorosis in each case? How variable is the extent of chlorosis in individual leaves?
  3. Is the difference in chlorosis biologically meaningful?

As it stands, we don’t know what the chlorosis measurements are, which means we can’t interpret them as “strong” or “weak”, and we can’t tell how different the values are for AvrX and AvrY. We can improve this plot by indicating what quantities are meant by the bars in the chart, adding values on the y-axis:

Figure 3: Bar chart showing effect of chlorosis due to application of AvrX and AvrY, with observed chlorosis values indicated on the y-axis
Questions
  1. What conclusions can you draw from Figure 3 that couldn’t be drawn from Figure 2?
  2. Is Figure 3 a fair representation of the data?
  3. Does the figure still support the statement that AvrY induces chlorosis more strongly than AvrX?
  1. Which bar is larger? By how much?
  2. Is the difference in bar sizes proportionate to the difference in extent of chlorosis? How variable are the measurements?
  3. Is the difference in chlorosis biologically meaningful?

Now we can read off the chart that AvrX has a value of 9.5 and AvrY has a value of about 9.75. If we knew more about chlorosis, we could now tell if this was a strong or a weak effect.

But there is something not right about this graphic. The blue bar looks to be twice the size of the green bar, yet the values differ by only \(\frac{0.25}{9.5} \approx 2.6\%\)! It would be fairer to start the y-axis at zero, so that the size of each bar is proportional to the value it represents.
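The mismatch between the visual impression and the numbers can be checked with a little arithmetic. The sketch below is illustrative only: the values 9.5 and 9.75 are those read off the chart, but the truncated axis start of 9.25 is an assumption (the text does not state it), chosen because it would make one bar appear twice as tall as the other. The function names are hypothetical helpers.

```python
def relative_difference(a, b):
    """Difference between b and a, expressed as a fraction of a."""
    return (b - a) / a


def apparent_ratio(a, b, baseline):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)


avr_x = 9.5                 # value read from the chart
avr_y = 9.75                # value read from the chart (approximate)
truncated_baseline = 9.25   # ASSUMED axis start; not given in the text

print(f"relative difference:       {relative_difference(avr_x, avr_y):.1%}")
print(f"bar ratio, truncated axis: {apparent_ratio(avr_x, avr_y, truncated_baseline):.2f}")
print(f"bar ratio, zero baseline:  {apparent_ratio(avr_x, avr_y, 0.0):.2f}")
# relative difference:       2.6%
# bar ratio, truncated axis: 2.00
# bar ratio, zero baseline:  1.03
```

Under that assumed baseline, the drawn bars differ by a factor of two while the measured values differ by under 3%. With a zero baseline the ratio of bar heights equals the ratio of the values.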

Research shows that people do, in fact, interpret bar charts by bar area, not bar height. Humans have a limited repertoire of elementary perceptual tasks that we use to extract data and information from graphical representation (Cleveland and McGill 1984). Graphs may use these well for communication, or misuse them to fool the reader.

A number of studies empirically test people’s ability to correctly “read” graph types, and the results indicate that bar charts do not perform very well (Heer and Bostock 2010).

Figure 4: Bar chart showing effect of chlorosis due to application of AvrX and AvrY, with the y-axis starting at zero, and values marked
Questions
  1. What conclusions can you draw from Figure 4 that couldn’t be drawn from Figure 3?
  2. Does Figure 4 tell the same story as Figure 3?
  3. Is there anything else we still need to know?
  1. Is the difference in bar sizes proportionate to the difference in extent of chlorosis?
  2. Is the difference in chlorosis biologically meaningful?
  3. How variable are the measurements for AvrX and AvrY?
Warning

When we read the literature, it is tempting to skim over papers and “read the figures,” trusting that they tell an unbiased and fair story about the data. That is not always true. Figures should be read at least as carefully as the text.

There is one more thing we need to make proper sense of the figure. At the moment, the values for AvrX and AvrY are presented as single values: one location for each gene. But we saw at the start of the story that 30 plants were measured:

We measured chlorosis in 30 plants after the application of either AvrX or AvrY. The figure demonstrates that the induction of chlorosis was stronger for AvrY than for AvrX.

It’s unlikely that all of the measurements for each gene were identical. We would assume that there has been some spread of data around that location. This spread is commonly represented with error bars, as in Figure 5.

Figure 5: Bar chart showing effect of chlorosis due to application of AvrX and AvrY, with the y-axis starting at zero, values marked on that axis, and error bars indicating standard deviation
Questions
  1. What conclusions can you draw from Figure 5 that couldn’t be drawn from Figure 4?
  2. Does Figure 5 tell the same story as Figure 4?
  3. Is there anything else we might still want to know?
  1. How variable are the measurements for AvrX and AvrY?
  2. Is the difference in bar sizes proportionate to the difference in extent of chlorosis?
  3. What do the error bars mean? Do they fairly represent the spread of measurements that were made?
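To make the last question concrete, here is a minimal sketch of how a bar height and its error bar might be derived from individual measurements. The values are simulated, not the data behind Figure 5: the split of 15 plants per treatment, the spread of 0.5, and the helper name `summarise` are all assumptions made for illustration. It also computes the standard error of the mean, a different quantity that is sometimes drawn as an error bar, to show why a figure needs to say which one it uses.

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulated values are reproducible

# SIMULATED measurements, not the data behind Figure 5. The text does not say
# how the 30 plants were divided, so 15 per treatment is an assumption, as is
# the spread (0.5) used to generate the values.
measurements = {
    "AvrX": [random.gauss(9.5, 0.5) for _ in range(15)],
    "AvrY": [random.gauss(9.75, 0.5) for _ in range(15)],
}


def summarise(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)    # spread of the individual measurements
    sem = sd / len(values) ** 0.5    # uncertainty in the estimated mean
    return mean, sd, sem


for name, values in measurements.items():
    mean, sd, sem = summarise(values)
    print(f"{name}: mean={mean:.2f}  sd={sd:.2f}  sem={sem:.2f}  n={len(values)}")
```

For the same data, the standard error is always smaller than the standard deviation whenever there is more than one measurement, so the two choices of error bar can give quite different impressions of how well separated the treatments are.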
Tip

As it happens, there is more that we might want to know, and you can explore this in the interactive session at the link below:

References

Cleveland, William S, and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” J. Am. Stat. Assoc. 79 (387): 531. https://doi.org/10.2307/2288400.
Heer, Jeffrey, and Michael Bostock. 2010. “Crowdsourcing Graphical Perception.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM. https://doi.org/10.1145/1753326.1753357.