2024-10-23

1. Introduction

Learning Objectives

  • You should be able to critically analyse how data is visualised
  • You should be able to judge a figure’s clarity and potential for misunderstanding
  • You should be able to identify potential sources of bias resulting from the visualisation
  • You should understand how to create effective figures for your own work

Background Reading

Exercise: Four figures (assess all figures)

  • For each figure, consider the following:
    • What type of data is being presented?
    • Are the data presented effectively? (why/why not?)
    • How can the data presentation be improved?
    • Use the DOI provided to find the paper the figure is from, if you need more information than is in the figure legend
  • Fill in the pro forma with your answers to the questions above (one sentence each)

Figure 1 (doi:10.1073/pnas.2320257121)

Figure 2 (doi:10.1073/pnas.2405474121)

Figure 3 (doi:10.1073/pnas.2408540121)

Figure 4 (doi:10.1073/pnas.240342112)

2. Summary Results

Responses by figure

  • We received 132 ratings in total (at four figures per student, this corresponds to 33 students responding)

Overall effectiveness

  • How were effectiveness scores distributed across all figures?

Overall understandability

  • How were understandability scores distributed across all figures?

Overall appeal

  • How were appeal scores distributed across all figures?

Time taken per figure

  • How long did you take, per figure?

3. Results By Figure

Effectiveness/Understandability/Appeal

  • How effective/understandable/appealing did you think each figure was?

Colours/Fonts/Labels

  • How well did each figure use colours, fonts, and labels?

Statistics/Whitespace/Data

  • How well did each figure use statistics, whitespace, and data?

Reproduction

  • How well did you think you could reproduce each figure?

4. Individual Figures

Figure 1 (doi:10.1073/pnas.2320257121)

  • Suggested improvements (LP):
    • The lower extent of error bars is not visible in (B), (D), (G), or (E). Avoid “dynamite plots.”
    • Bar charts should be avoided; a 1D scatterplot for (D), (G), and (E) would be clearer. A line plot with concentration on the \(x\)-axis would improve (B).
    • We can’t see the difference between the effects of the two concentrations of A16 in (D), or of A16 vs A18 in (B); use a table of contrasts.
    • Place things to be compared by the reader next to each other where possible (E).
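
As a minimal sketch of why bars with error bars can hide what the data look like, the example below uses two made-up samples (hypothetical values, not taken from the paper). Both reduce to the same mean and standard deviation, so they would produce identical bars, yet their raw values are clearly distributed differently. The function name `bar_summary` is an assumption for illustration.

```python
from statistics import mean, stdev

# Hypothetical measurements, invented for illustration only
# (these are NOT values from the paper).
unimodal = [2, 4, 4, 4, 5, 5, 7, 9]
bimodal = [3, 3, 3, 3, 7, 7, 7, 7]


def bar_summary(values):
    """Return the (mean, sample SD) pair that a bar-and-error-bar plot shows."""
    return round(mean(values), 3), round(stdev(values), 3)


# Both samples reduce to the same bar height and error bar length...
print(bar_summary(unimodal))
print(bar_summary(bimodal))

# ...but the raw values, which a 1D scatterplot would show, are different.
print(sorted(unimodal))
print(sorted(bimodal))
```

Plotting every point, for instance as a 1D scatterplot, lets the reader see the difference that the summary bars conceal.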

Figure 1 (doi:10.1073/pnas.2320257121)

  • Suggested improvements (MF):
    • The scale on the micrographs (C) is too small to read easily.
    • In addition to showing DAPI and immunofluorescence images of the cells, we should really be seeing a bright-field micrograph of the cells (no fluorescence).
    • The colour scheme is misleading (compare 1 \(\mu\)M A18 in B vs 5 \(\mu\)M in D)
    • Too many comparisons in B - chartjunk
    • y-axis scales should be the same to make comparisons between panels easier
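
Where several panels show the same kind of quantity, one way to give them a common y-axis is to compute a single range from all of the data before plotting. The sketch below assumes hypothetical panel values (not taken from the paper); the function name `shared_y_limits` and the `pad` parameter are illustrative choices.

```python
# Hypothetical per-panel values, invented for illustration only.
panels = {
    "B": [0.8, 1.0, 1.4],
    "D": [0.5, 1.1, 2.0],
    "E": [0.9, 1.2, 1.6],
}


def shared_y_limits(panel_values, pad=0.05):
    """Return one (ymin, ymax) pair covering every panel, with a little padding."""
    all_values = [v for values in panel_values.values() for v in values]
    lowest, highest = min(all_values), max(all_values)
    margin = (highest - lowest) * pad
    return lowest - margin, highest + margin


# The same limits would then be applied to every panel's y-axis.
print(shared_y_limits(panels))
```

This only makes sense when the panels share units; for bar charts specifically, the axis would normally also be extended to include zero.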

Figure 1 (doi:10.1073/pnas.2320257121)

improvements
reorganising in a way that flows slightly better to the eye
Worth noting that whilst the bar charts they have used are easy to read, they might not be representative of any distribution in results. Alternatives might be histograms or violin plots.
Show data points in scatterplot or box plot
I think the figure legend could be more detailed
The scales on the bar charts are all different so at a glance makes the results seem more similar
better spacing more accurate number
Could box off panels to make more distinct from one another;
Perhaps the error bars on panel B were too large i.e. took up half the space on the graph and could’ve been presented differently
Re-organized
remove the numbers on the bars, have all backgrounds the same colour eg white or black in Figure C, have a chart at the side displaying what colour of bar is what group instead of putting each group name on each bar, remove the lines showing what the ** are referring to in Figure B and the ns sign in Figure E.
Colour shading could be improved, difficult to distinguish between subset of bars
Colour scheme could be easier to see - could be difficult to tell the difference in shades
I think the colours in the bar charts could be clearer.
I believe if dot plots were also plotted so you could see first hand the variation of the data sets.
Providing clear visual representations, detailed legends, and contextual information will aid in communicating the significance of the research.
make bar section for A18 on part E as well
simplify each colour section
The layout of the different components of the figure could be improved. It is not appealing and does not flow very well, making it a bit harder to interpret. To improve the components should all be of equal size which may make it easier to follow.
no idea
it can be improved by making the text size consistent throughout
unsure
Not sure how to fix because they are comparing so many bars in B
n/a
More text on the diagrams, the layout of the various diagrams can be adjusted so that they can be better compared with each other to convey meaning
keep all the different kinds of figures together e.g. bar graphs along side bar graphs, images along side images
making the figures easier to read
maybe less is more
molecular weight
I think it is presented well.
Presenting all the graphs in the same size, making use of more white space as well as using a box plot combined with scatter plot or at least use scatter and the bar.
better use of colours
Not sure
not sure
Mention the N numbers either on the graphs or in the legend instead of just saying at least 3. Maybe try and use other symbols to represent significance lines and clarify what’s being compared in the legend, for example in figure B, it’s mostly just significance lines which clutters the whole graph and makes it look more complicated than it is.
Same colour/shade for same concentration of a particular constituent
bigger labels maybe
Use box and whisker plots with data points overlayed?
Maybe grouping together certain parts of the data to allow better comparison
there isn’t much difference in the colour gradient of the green; changing the middle column to a slightly darker colour would make the difference more clear
No requirement to state no significance.
Figure E need a bit of correction as it might be better labelled
n/a
not sure :/
Could be spaced out more, quite cramped
I would’ve liked to see the actual scale bar with the values on the visualisations of the fibroblasts and not just in the figure legend, but this could make the figure look a bit too busy.
The layout can be improved. Put the same kind of charts together.
Split into smaller grouped info
make all parts the same size
explaining any abbreviations e.g. DMSO
Individual data points could be shown. Perhaps a boxplot with jittered scatter plot.
N/A
split up into separate figures to make info more digestible
The data presentation in this figure could be improved by increasing the font size of labels and numbers in the bar graphs for better readability, and explicitly stating the exact sample sizes for each condition in the legend.
more instructions on what to look at in the immunofluorescence confocal image
a few things i would do differently would be the stats. the way they are laid out makes it harder to read for me personally but i can see how this might be useful for other people
the treatment length was different for each group; it should have been the same, because otherwise the groups are not really comparable.

Figure 2 (doi:10.1073/pnas.2405474121)

  • Suggested improvements (LP):
    • UMAP plots (B, E) are highly manipulable and clustering/placement does not necessarily reflect objective measures.
    • Unpleasant colour choices in (C); there is room for aesthetic improvement.
    • The proportion plot in (C) does not give information on absolute number, only proportion; a proportional areas plot spanning all clusters would more honestly represent the data.
    • Heatmap text is too small to read comfortably; is there too much data here?
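
To illustrate the point about proportions versus absolute numbers, the sketch below uses invented counts (not data from the paper) for two hypothetical clusters. The cluster and group names, and the function `proportions`, are assumptions for illustration.

```python
# Hypothetical cell counts per cluster, invented for illustration only.
counts = {
    "cluster_1": {"group_a": 900, "group_b": 100},
    "cluster_2": {"group_a": 9, "group_b": 1},
}


def proportions(group_counts):
    """Convert absolute counts to fractions of the cluster total."""
    total = sum(group_counts.values())
    return {group: n / total for group, n in group_counts.items()}


for name, group_counts in counts.items():
    # Identical proportions, but totals that differ by a factor of 100.
    print(name, proportions(group_counts), "total =", sum(group_counts.values()))
```

A proportion plot would draw these two clusters identically, even though one rests on 1000 cells and the other on 10; a plot whose areas scale with the counts keeps that information visible.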

Figure 2 (doi:10.1073/pnas.2405474121)

  • Suggested improvements (MF):
    • The flow diagram (A) could make better use of arrows to illustrate order of steps
    • Text overall is too hard to read comfortably
    • Heatmap in (D) is missing a scale (is purple high and yellow low, or vice versa?)
    • Consider what is needed to convey the figure’s intended message: if it’s just that these macrophages exhibit transcriptional heterogeneity, then D is probably sufficient for that purpose - the other panels don’t add much for me
    • Whitespace usage could be improved - cramming (C) under the inset from (B) makes the figure feel very crowded

Figure 2 (doi:10.1073/pnas.2405474121)

improvements
snapshot of D and reoutput of C
Including more white space, improving figure legend, alternative methods of data visualisation.
Simplified figures, less unnecessary colour and text
make the figure legend more straightforward
make some text bigger
less colours and images made bigger
less colours and less content heavy data, split into more figures
the presentation could be improved by adding more detailed descriptions of the statistical methods for each figure, particularly for the transcription heatmaps, and including more explicit explanations for non-significant results in the bar charts. additionally, color schemes could be optimized for accessibility
The presentation could be improved by adding more detailed descriptions of the statistical methods for each figure, particularly for the transcription heatmaps, and including more explicit explanations for non-significant results in the bar charts. Additionally, colour schemes could be optimized for accessibility
should’ve been split into separate figures to allow the reader to understand all of them
present the data in an easier, more readable format - some forms i.e. particularly panel C do not actually explain what is going on in the figure, just stating the type of graph used
maybe explain what each figure actually shows
Space out the rows in C as hard to follow across
Specifically part D of the figure is harder to decipher due to the small text.
label more stuff explain why the genes are relevant
Tidy up
Selection of key data and highlighting of relevant information in different characters
Less info
change to colour scheme and layout of all graphs. Statistical data (e.g. sample size) needs to be included in legend.
I’m not sure what the best option for this data would be
less bright colours and more monochromatic colours
Could be formatted differently or separated into different figures.
Perhaps separating the figure into two parts would make it easier to focus on the information rather than having it all in one, or else, if possible, reducing the number of names on the Y axis in section D to make it more readable.
I think that they can reduce the number of figures
Clearly define the borders/limits of each figure
.
increase space in between sections so they are clearer to distinguish e.g. between a and b
add full borders around each part might help reading flow
no
Genuinely? Haven’t a clue
Perhaps breakdown the heatmap into parts or focus on key results only.
I think less colour could have been used as the data would not be interpretable to a colour blind audience. I also think a collection of smaller bar charts for C instead would be better
split it up, explain what each part is
Could be improved by adding in some quantifiable data with better figure legends, i.e. explaining what UMAP actually means.
better organisation for one
Less colours

Figure 3 (doi:10.1073/pnas.2408540121)

  • Suggested improvements (LP):
    • The lower extent of error bars is not visible in (D) or (E). These are “dynamite plots,” which should be avoided.
    • Bar charts should be avoided in general; a 1D scatterplot of each dataset in (D) and (E) would be clearer.
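
As a rough illustration of what a 1D scatterplot conveys, the sketch below draws one in plain text, one line per group. The group names and values are hypothetical (not taken from the paper), and `strip_plot` is an illustrative helper; in practice a plotting library would be used instead.

```python
def strip_plot(values, lo, hi, width=41):
    """Render values as a one-line text 1D scatterplot between lo and hi.

    A single value in a column is drawn as 'o'; several values sharing a
    column are drawn as their count (capped at 9); empty columns are '.'.
    """
    columns = [0] * width
    for v in values:
        # Map the value linearly onto a column index, clamped to the axis.
        idx = round((v - lo) / (hi - lo) * (width - 1))
        idx = max(0, min(width - 1, idx))
        columns[idx] += 1
    return "".join(
        "." if n == 0 else "o" if n == 1 else str(min(n, 9)) for n in columns
    )


# Hypothetical measurements, invented for illustration only.
groups = {
    "wild type": [1.0, 1.2, 0.9, 1.1],
    "mutant": [0.3, 0.4, 1.0, 0.35],
}

for name, values in groups.items():
    print(f"{name:>10} |{strip_plot(values, lo=0.0, hi=2.0)}|")
```

Unlike a bar showing only the mean, each line shows every measurement, so an outlying replicate (such as the single high value in the hypothetical mutant group) is immediately visible.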

Figure 3 (doi:10.1073/pnas.2408540121)

  • Suggested improvements (MF):
    • Failure to complement the triple mutant strain???
    • Fluorescence wavelength not specified (C)
    • Colour scheme in (A) is inconsistent
    • Axis label in (B) could be improved
    • Figure legend for (A) could be more succinct/some of this info should be in the paper text instead

Figure 3 (doi:10.1073/pnas.2408540121)

improvements
Perhaps the flow of the figure could be improved. The way the individual panels are labelled (A-E) doesn’t appear to have as logical a flow as it could.
State what the abbreviations mean
By using a larger number scale for figure D and also a larger figure in general
reordering of panels - this didn’t make sense to me as it read in a very confusing way
I think using a bar graph for parts D and E was not so easy to understand as it is just an average of all four so does not give much information
changing the colour/ shading of parts D and E to make the genotype bars more different
perhaps label in different order; diagram A, growth curve B, 2 bar graphs C and D, and photographs as E
-
it could’ve been improved by using slightly more contrasted colours in the d and e diagrams which maybe could’ve matched with figure b like the orange
had to look up every 2nd word
Figures 1D & 1E could be presented as box plots to convey statistical information about the data.
Signalling pathway maps can be further annotated with key information such as NAG etc.
use scatter plots instead of bar charts
using the same colours for the mutation and control groups in all figures
Part B in the legend mentions “Growth curve in TY.”, unless I am missing something, I don’t see what this means anywhere. Maybe it is a microbiology term I am not familiar with since I do biomedical science but maybe it should specify?
N/A
N/A
not sure
Have more line spacing to make the figure legend easier to read
By adding colour to some of the diagrams.
Unsure
maybe colour coding the bar charts
the bar graphs ideally need to be shades further apart and the label on C i would put at the top as opposed to in the figure

Figure 4 (doi:10.1073/pnas.240342112)

  • Suggested improvements (LP):
    • The rifampicin structure is purely decorative and could be removed.
    • The lower extent of error bars is not visible in (B). This is a “dynamite plot,” which should be avoided.
    • Bar charts should be avoided in general; a 1D scatterplot of each dataset in (B) would be clearer.
    • The implied membrane in (C) and (D) could be stated as such in the figure.

Figure 4 (doi:10.1073/pnas.240342112)

  • Suggested improvements (MF):
    • Colours in (A) difficult to distinguish, especially with the red boxes which seem to skew blue closer to purple
    • The Western blot in (A) showing 2 cut-out bands is absolutely not an appropriate way to present this type of data - show the whole thing
    • Colour scheme in (B) doesn’t seem purposeful and doesn’t add anything to the figure
    • Text in (D) is too small to read easily
    • By convention in microbiology, would assume that the periplasm/extracellular space is “up” and the cytoplasm is “down” in (C) and (D) - but this should really be labelled to avoid any potential confusion

Figure 4 (doi:10.1073/pnas.240342112)

improvements
im not sure
Change figure A
Different colours, as green, salmon and gold don’t look too appealing together
Change A to a bar graph, showing significance values and actual analysis, move western blotting to perhaps B where mutant type is more relevant and add significance to B. Simplify C and D to become more clear at the characterisation, showing labelled image first and perhaps adding the topology as visual aid, highlighting same areas to colour
its confusingly laid out, with B under A rather than across from it. the author also doesn’t make it clear what the red boxes are around the western blot results. the author also should’ve plotted the results of the western blot to better visualise changes in the data. A should also be labelled to make it clearer to the reader and B should have statistics on it about the significance level
explain why the structures are relevant
less boring
Addition of two-factor data analyses and representation on graphs
changing colour schemes and including statistical data
Arrange the figure legend just slightly to flow more smoothly with the movement of the figure. Instead of (A) in the legend going top, bottom then middle, follow the direction of the figure - top, middle then bottom.
There was no statistical analysis on the bar graph.
use more colour-blind friendly colours
NA
Explaining the significance of the structural figures in the context of the study
The inclusion of showing the wild type and E161Q western blot doesn’t feel like it should be in the same section of the figure as the rifampicin and ethambutol plates
split the protein structures into a different figure
for one increase the size of the structure of rifampicin and have it as a part A or remove it completely

5. Summing Up

General Comments

  • Colour choices
  • Larger figures/graphs, more space between figures/graphs
  • Too much data per figure
  • Split into multiple figures
  • Remove unnecessary data (how do we define this?)
  • “The data is presented in a manner that would likely be inaccessible for people without prior experience. A move toward a more palatable/digestible format will facilitate better science communication in the future.”

Visualising Data About Data Visualisation

  • What did you say about figure effectiveness?

Visualising Data About Data Visualisation

  • What words did you use to describe potential figure improvements?
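
One way a word summary like this might be produced is a simple word count over the free-text responses. The sketch below is an assumption about the approach, not the method actually used for this slide. It takes five responses quoted from the student lists earlier in this document; the stop-word list and the function `word_counts` are hypothetical.

```python
import re
from collections import Counter

# A handful of student responses quoted from earlier in this document;
# the full set would be used in practice.
responses = [
    "Less colours",
    "less colours and images made bigger",
    "better use of colours",
    "make some text bigger",
    "Less info",
]

# Small, hypothetical stop-word list; a real analysis would need a fuller one.
STOP_WORDS = {"and", "of", "some", "made", "make", "use"}


def word_counts(texts):
    """Count lower-cased words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# The most frequent words could then feed a bar chart or word cloud.
print(word_counts(responses).most_common(3))
```

Choices such as which words to drop or merge affect which terms appear most prominent, so the resulting picture is itself shaped by the analyst's decisions.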

Data Visualisation is Not Neutral