Reproducibility of computational workflows is automated using continuous analysis
- PMID: 28288103
- PMCID: PMC6103790
- DOI: 10.1038/nbt.3780
Abstract
Replication, validation and extension of experiments are crucial for scientific progress. Computational experiments are scriptable and should be easy to reproduce. However, computational analyses are designed and run in a specific computing environment, which may be difficult or impossible to match using written instructions. We report the development of continuous analysis, a workflow that enables reproducible computational analyses. Continuous analysis combines Docker, a container technology akin to virtual machines, with continuous integration, a software development technique, to automatically rerun a computational analysis whenever updates or improvements are made to source code or data. This enables researchers to reproduce results without contacting the study authors. Continuous analysis allows reviewers, editors or readers to verify reproducibility without manually downloading and rerunning code and can provide an audit trail for analyses of data that cannot be shared.
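The rerun-on-change idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fingerprint`, `docker_command`, `rerun_if_changed`), the JSON audit-log layout, and the `/work` mount point are all hypothetical. A real setup would delegate change detection to a continuous integration service instead of hashing files locally.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def fingerprint(paths):
    """Return a SHA-256 digest over the names and contents of the given files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def docker_command(image, workdir, command):
    """Build a `docker run` invocation that executes `command` inside `image`.

    The working directory is mounted at /work (a hypothetical choice) so the
    analysis sees the same source code and data on every run.
    """
    return ["docker", "run", "--rm",
            "-v", f"{Path(workdir).resolve()}:/work", "-w", "/work",
            image, *command]


def rerun_if_changed(inputs, state_file, command, runner=subprocess.run):
    """Rerun `command` when the inputs changed and append the run to a log.

    Returns True if the analysis was rerun, False if nothing changed.
    """
    state_path = Path(state_file)
    history = json.loads(state_path.read_text()) if state_path.exists() else []
    current = fingerprint(inputs)
    if history and history[-1]["fingerprint"] == current:
        return False  # inputs are identical to the last recorded run
    runner(command, check=True)  # raises if the analysis exits with an error
    history.append({"fingerprint": current, "command": command})
    state_path.write_text(json.dumps(history, indent=2))
    return True
```

Each entry in the log ties a run to the exact state of the code and data that produced it, which is a simplified stand-in for the audit trail the abstract mentions. The `runner` parameter exists only so the sketch can be exercised without Docker installed.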
Conflict of interest statement
The authors have no competing financial interests to declare.
Similar articles
- Building Containerized Workflows Using the BioDepot-Workflow-Builder. Cell Syst. 2019 Nov 27;9(5):508-514.e3. doi: 10.1016/j.cels.2019.08.007. Epub 2019 Sep 11. PMID: 31521606. Free PMC article.
- Bioinformatics recipes: creating, executing and distributing reproducible data analysis workflows. BMC Bioinformatics. 2020 Jul 8;21(1):292. doi: 10.1186/s12859-020-03602-6. PMID: 32640986. Free PMC article.
- GUIdock: Using Docker Containers with a Common Graphics User Interface to Address the Reproducibility of Research. PLoS One. 2016 Apr 5;11(4):e0152686. doi: 10.1371/journal.pone.0152686. eCollection 2016. PMID: 27045593. Free PMC article.
- Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nat Methods. 2021 Oct;18(10):1161-1168. doi: 10.1038/s41592-021-01254-9. Epub 2021 Sep 23. PMID: 34556866. Review.
- Workflow based framework for life science informatics. Comput Biol Chem. 2007 Oct;31(5-6):305-19. doi: 10.1016/j.compbiolchem.2007.08.009. Epub 2007 Aug 19. PMID: 17931570. Review.
Cited by
- Open-source analytical pipeline for robust data analysis, visualizations and sharing in crop breeding. Plant Methods. 2022 Feb 5;18(1):14. doi: 10.1186/s13007-022-00845-7. PMID: 35123539. Free PMC article.
- s ·nr: a visual analytics framework for contextual analyses of private and public RNA-seq data. BMC Genomics. 2019 Jan 24;20(1):85. doi: 10.1186/s12864-018-5396-0. PMID: 30678634. Free PMC article.
- Aether: leveraging linear programming for optimal cloud computing in genomics. Bioinformatics. 2018 May 1;34(9):1565-1567. doi: 10.1093/bioinformatics/btx787. PMID: 29228186. Free PMC article.
- Improving the usability and archival stability of bioinformatics software. Genome Biol. 2019 Feb 27;20(1):47. doi: 10.1186/s13059-019-1649-8. PMID: 30813962. Free PMC article.
- Sci-Hub provides access to nearly all scholarly literature. Elife. 2018 Mar 1;7:e32822. doi: 10.7554/eLife.32822. PMID: 29424689. Free PMC article.
