  • Reproducibility and replicability are central to the integrity of scientific research
    • The terms may not always be used consistently in the community
  • They represent the ideas of independent researchers verifying a result and independent researchers finding the same result using an independent dataset
  • The data, methods, analysis, and results of scientific work should be open and transparent to ensure reproducibility and replicability
  • FAIR principles describe how research data can be made reusable and interoperable for automated analyses, to support reproducibility, replicability, and future research

1 Reproducibility and Replicability

Scientific and medical research depends on all of us, as scientists, generating robust and reliable new knowledge. With technological advances such as high-throughput assays and genomics, we are producing data and publications at a much faster rate than ever before. Research is increasingly interdisciplinary, and the questions and systems we investigate are ever more complex. The demands on time and effort to ensure the credibility of published science are huge, and it is essential that we take personal responsibility for making our research reproducible and replicable [1].

There are two distinct concepts encapsulated in the terms reproducibility and replicability:

  1. Applying the original researcher’s analysis to the dataset that was originally used will regenerate the reported results, even if performed by a different research team.
  2. A new research team, collecting their own data and potentially using an alternative analytical or experimental methodology, will arrive at the same result.

The main difference is that concept (1) does not require new data to be generated (or new experiments to be performed), whereas concept (2) does, and is closer to the idea of independent verification.

Confusingly, some disciplines (e.g. biomedical science and statistics) refer to (1) as reproducibility and (2) as replicability, whereas others (including microbiology) flip the definitions and refer to (1) as replicability and (2) as reproducibility [2].

1.1 Reproducible Research and Independent Verification

Regardless of which definition is assigned to reproducibility or replicability, there is a growing culture of reproducible research in science, where the complete and/or cleaned dataset and the analysis pipeline or workflow used to generate a result are made available. This is intended to make research transparent so that it can be verified and built upon by other research teams.

Verification of existing results is an important part of ensuring the correctness of the scientific record, and can be considered a minimum standard for judging a scientific claim, but it is not sufficient in isolation to establish the “truth” of the result. A single well-conducted experiment establishes evidence for a scientific result, but this evidence is strengthened when the result can be replicated by multiple independent research teams, each generating their own independent data.

Both reproducing the original analysis and independently verifying the result benefit from the original work being transparent, with the original data being made available and the research protocol clearly described. The 2013 paper by Sandve et al. [3] describes ten simple rules for making computational research reproducible. These are focused on computational work, including data analysis, but some are general principles (a short scripted sketch illustrating several of them follows the list below):

  1. For Every Result, Keep Track of How It Was Produced
  2. Avoid Manual Data Manipulation Steps
    • “manual data manipulation” here includes copy/paste operations in tools like Excel
  3. Archive the Exact Versions of All External Programs Used
    • in a laboratory context, this is equivalent to recording the manufacturer and batch number of commercial kits and reagents, and the manufacturer and model number of apparatus
  4. Always Store Raw Data behind Plots
  5. Connect Textual Statements to Underlying Results
  6. Provide Public Access to Data, Scripts, Runs, and Results
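
To make the computational rules above concrete, here is a minimal sketch in Python of what a scripted analysis following several of them might look like. The file names, the colony-count dataset, and the choice of pandas and matplotlib are illustrative assumptions, not part of the original rules paper.

```python
# A minimal, illustrative analysis script (hypothetical file names and data).
# Each step notes the rule it is following.

import json
import os
import sys

import matplotlib
import matplotlib.pyplot as plt
import pandas as pd

RAW_DATA = "data/raw/colony_counts.csv"  # hypothetical raw dataset
OUT_DIR = "results"
os.makedirs(OUT_DIR, exist_ok=True)

# Avoid manual data manipulation: all cleaning is done in code,
# not by copy/paste in a spreadsheet.
df = pd.read_csv(RAW_DATA)
df = df.dropna(subset=["treatment", "count"])

# Keep track of how every result was produced: this script is the record,
# and it also archives the exact software versions used.
versions = {
    "python": sys.version,
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
}
with open(os.path.join(OUT_DIR, "versions.json"), "w") as fh:
    json.dump(versions, fh, indent=2)

# Always store raw data behind plots: save the exact values that are plotted
# alongside the figure itself.
summary = df.groupby("treatment")["count"].mean()
summary.to_csv(os.path.join(OUT_DIR, "figure1_data.csv"))

summary.plot(kind="bar")
plt.ylabel("Mean colony count")
plt.savefig(os.path.join(OUT_DIR, "figure1.png"), dpi=300)
```

Because every step lives in the script, anyone with the raw data file can rerun it and regenerate the plot, the values behind it, and the record of software versions used.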
Figure 1.1: [Adam Savage](https://en.wikipedia.org/wiki/Adam_Savage) knows the difference. And now so do you.

2 FAIR Principles

The first FAIR (Findable, Accessible, Interoperable, and Reusable) guidelines were published in 2016 [4]. Rather than focusing only on sharing data among human researchers, the FAIR principles are intended to make it easier to share data in a way that makes them computationally interoperable and reusable. A key driver for this is the realisation that, for all the value in an individual scientific study, its impact and importance can be magnified greatly when integrated with other research outputs, either to establish replicability of a result, or to build a new, larger analysis.

For a scientific result to be usable in this way, it has to be (a minimal metadata sketch follows the list below):

  • findable: the resource must have a unique and persistent identifier, and be labelled with enough information (metadata) to explain the data it contains
  • accessible: the resource must be retrievable using its unique identifier via a standardised, open, and free protocol (access to the data themselves may still require authentication or authorisation)
  • interoperable: the resource must use a formal, shared, and broadly applicable language for representing the knowledge it contains
    • this often means using one of the open, plain-text data standards and formats we discuss elsewhere
  • reusable: the data and metadata must have a clear licence for use, meet community standards, and be described with sufficient detail to be useful
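
As a concrete, if simplified, illustration of these principles, the sketch below writes a small plain-text metadata record alongside a dataset. The field names, the placeholder identifier, and the example dataset are assumptions made for illustration; a real FAIR deposit would use a community metadata schema and a persistent identifier issued by a data repository.

```python
# A minimal, illustrative metadata record saved next to a dataset.
# Field names and values here are hypothetical, not a formal metadata standard.

import json

metadata = {
    "title": "Colony counts under three antibiotic treatments",   # hypothetical dataset
    "identifier": "https://doi.org/10.xxxx/placeholder",          # a repository would issue the real persistent ID
    "creators": ["A. Student"],
    "description": "Raw colony counts from the example project, one row per plate.",
    "file_format": "text/csv",    # open, plain-text format aids interoperability
    "licence": "CC-BY-4.0",       # a clear licence supports reuse
    "keywords": ["microbiology", "antibiotics", "reproducibility"],
}

with open("colony_counts.metadata.json", "w") as fh:
    json.dump(metadata, fh, indent=2)
```

Even a lightweight record like this covers the essentials: what the data are, who created them, how they are formatted, and what others are allowed to do with them.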

It is almost certainly beyond the scope of your undergraduate project to make your work fully FAIR, but the principles of documenting your data and analysis well, and of using open, standard file formats, are universal. You can find out more about making your work FAIR at the link below:


  1. Begley CG & Ioannidis JPA (2015) “Reproducibility in Science” Circulation Research 116:116–126

  2. National Academies of Sciences, Engineering, and Medicine (2019) “Reproducibility and Replicability in Science”, Chapter 3: “Understanding Reproducibility and Replicability”. Washington (DC): National Academies Press. Available from: https://www.ncbi.nlm.nih.gov/books/NBK547546/

  3. Sandve GK, Nekrutenko A, Taylor J, & Hovig E (2013) “Ten Simple Rules for Reproducible Computational Research” PLoS Computational Biology doi:10.1371/journal.pcbi.1003285

  4. Wilkinson M, Dumontier M, Aalbersberg I, et al. (2016) “The FAIR Guiding Principles for scientific data management and stewardship” Scientific Data 3:160018 doi:10.1038/sdata.2016.18