Scientific and medical research depends on all of us, as scientists, generating robust and reliable new knowledge. With technological advances such as high-throughput assays and genomics, we are producing data and publications faster than ever before. Research is increasingly interdisciplinary, and the questions and systems we investigate are ever more complex. Ensuring the credibility of published science demands a great deal of time and effort, and it is essential that we, as scientists, take personal responsibility for making our research reproducible and replicable1.
There are two distinct concepts encapsulated in the terms reproducibility and replicability:

1. obtaining consistent results when the original data are re-analysed using the same methods and analysis workflow;
2. obtaining consistent results in a new study that addresses the same scientific question with newly generated, independent data.

The main difference is that the concept in (1) does not require regeneration of data (or new experiments), but the concept in (2) does require this and is closer to the idea of independent verification.
Confusingly, some disciplines (e.g. biomedical science and statistics) refer to (1) as reproducibility and (2) as replicability, whereas others (including microbiology) flip the definitions and refer to (1) as replicability and (2) as reproducibility2.
Regardless of which definition is assigned to reproducibility or replicability, there is a growing culture of reproducible research in science, where the complete and/or cleaned dataset and the analysis pipeline or workflow used to generate a result are made available. This is intended to make research transparent so that it can be verified and built upon by other research teams.
Verification of existing results is an important part of ensuring the correctness of the scientific record, and can be considered a minimum standard for judging a scientific claim, but it is not sufficient in isolation to establish the “truth” of the result. A single well-conducted experiment establishes evidence for a scientific result, but this evidence is strengthened when the result can be replicated by multiple independent research teams, each generating their own independent data.
Both verification and independent replication benefit from the original work being transparent, with the original data made available and the research protocol clearly described. The 2013 paper by Sandve et al.3 describes ten simple rules for making computational research reproducible. These are focused on computational work, including data analysis, but some are general principles:

1. For every result, keep track of how it was produced.
2. Avoid manual data manipulation steps (for example, editing values by hand in Excel).
3. Archive the exact versions of all external programs used.
4. Version control all custom scripts.
5. Record all intermediate results, when possible in standardised formats.
6. For analyses that include randomness, note the underlying random seeds.
7. Always store raw data behind plots.
8. Generate hierarchical analysis output, allowing layers of increasing detail to be inspected.
9. Connect textual statements to underlying results.
10. Provide public access to scripts, runs, and results.
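Several of these rules can be illustrated in just a few lines of code. The sketch below is a minimal example in Python of recording the random seed and software versions behind an analysis, saving a cleaned intermediate table, and storing the exact data behind a figure. All of the file, directory and column names (data/raw_counts.csv, results/, count) are hypothetical placeholders, not anything prescribed by Sandve et al.

```python
# Minimal sketch of rules 3, 5, 6 and 7 in practice (Python).
# All file, directory and column names are hypothetical placeholders.
import json
import os
import platform
import random

import matplotlib
import matplotlib.pyplot as plt
import pandas as pd

os.makedirs("results", exist_ok=True)

# Rule 6: if the analysis involves randomness, record the seed used.
SEED = 20240101
random.seed(SEED)

# Rule 3: archive the exact software versions behind the analysis.
versions = {
    "python": platform.python_version(),
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
    "seed": SEED,
}
with open("results/environment.json", "w") as fh:
    json.dump(versions, fh, indent=2)

# Rule 5: keep intermediate results in a standard, open format.
raw = pd.read_csv("data/raw_counts.csv")   # hypothetical raw data file
clean = raw.dropna()
clean.to_csv("results/clean_counts.csv", index=False)

# Rule 7: store the exact data behind each plot alongside the figure.
clean.to_csv("results/figure1_data.csv", index=False)
plt.hist(clean["count"], bins=20)          # hypothetical column name
plt.xlabel("count")
plt.savefig("results/figure1.png", dpi=300)
```

Rules 4 and 10 are then largely a matter of keeping scripts like this under version control (for example with git) and depositing the scripts and results in a public repository.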
The first FAIR (Findability, Accessibility, Interoperability, and Reuse of digital assets) guidelines were published in 20164. Rather than focusing on sharing data among human researchers, the FAIR principles are intended to make it easier to share data in a way that makes them computationally interoperable and reusable. A key driver for this is the realisation that, for all the value in an individual scientific study, its impact and importance can be magnified greatly when integrated with other research outputs - either to establish replicability of a result, or to build a new, larger analysis.
For a scientific result to be usable in this way, it has to be:

- Findable: the data and metadata can be discovered by both humans and computers, for example through a persistent identifier and rich, searchable metadata.
- Accessible: it is clear how the data can be retrieved, including any authentication or authorisation that is required.
- Interoperable: the data use open, standardised formats and vocabularies, so that they can be combined with other datasets and processed by software.
- Reusable: the data are well described and released under a clear usage licence, so that others know whether and how they can reuse them.
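As a small, concrete illustration of what "computationally interoperable and reusable" can mean in practice, the sketch below (Python, with entirely hypothetical file names, column names and metadata values) writes a results table in an open format (CSV) together with a machine-readable metadata record describing the columns, the creator, and the licence under which the data may be reused.

```python
# Minimal sketch of machine-readable data plus metadata (Python).
# All names, values and identifiers below are illustrative placeholders.
import json

import pandas as pd

results = pd.DataFrame(
    {
        "isolate": ["A1", "A2", "B1"],
        "growth_rate_per_hour": [0.42, 0.39, 0.51],
    }
)

# Open, standard format for the data themselves (Interoperable).
results.to_csv("growth_rates.csv", index=False)

# A simple metadata record so the file can be found, understood and
# reused without asking the original authors (Findable, Reusable).
metadata = {
    "title": "Example growth-rate measurements",   # placeholder
    "creator": "Your Name",                         # placeholder
    "date_created": "2024-01-01",                   # placeholder
    "licence": "CC-BY-4.0",
    "columns": {
        "isolate": "identifier of the bacterial isolate",
        "growth_rate_per_hour": "exponential growth rate (per hour)",
    },
}
with open("growth_rates.metadata.json", "w") as fh:
    json.dump(metadata, fh, indent=2)
```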
It is almost certainly beyond the scope of your undergraduate project to make your work fully FAIR, but the principles of documenting your data and analysis well, and of using open, standard file formats, are universal. You can find out more about making your work FAIR at the link below:
Begley, CG, & Ioannidis, JPA (2015) “Reproducibility in Science: Improving the Standard for Basic and Preclinical Research” Circulation Research 116:116–126↩︎
National Academies of Sciences, Engineering, and Medicine (2019) “Reproducibility and Replicability in Science”, Chapter 3: “Understanding Reproducibility and Replicability”. Washington (DC): National Academies Press. Available from: https://www.ncbi.nlm.nih.gov/books/NBK547546/↩︎
Sandve GK, Nekrutenko A, Taylor J, & Hovig E (2013) “Ten Simple Rules for Reproducible Computational Research” PLoS Computational Biology doi:10.1371/journal.pcbi.1003285↩︎
Wilkinson M, Dumontier M, Aalbersberg I, et al. (2016) “The FAIR Guiding Principles for scientific data management and stewardship” Scientific Data 3:160018 doi:10.1038/sdata.2016.18↩︎