Reproducibility of computational workflows is automated using continuous analysis

Nat Biotechnol. 2017 Apr;35(4):342-346. doi: 10.1038/nbt.3780. Epub 2017 Mar 13.

Abstract

Replication, validation and extension of experiments are crucial for scientific progress. Computational experiments are scriptable and should be easy to reproduce. However, computational analyses are designed and run in a specific computing environment, which may be difficult or impossible to match using written instructions. We report the development of continuous analysis, a workflow that enables reproducible computational analyses. Continuous analysis combines Docker, a container technology akin to virtual machines, with continuous integration, a software development technique, to automatically rerun a computational analysis whenever updates or improvements are made to source code or data. This enables researchers to reproduce results without contacting the study authors. Continuous analysis allows reviewers, editors or readers to verify reproducibility without manually downloading and rerunning code and can provide an audit trail for analyses of data that cannot be shared.
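
The core idea of rerunning an analysis whenever source code or data change, while keeping an audit trail, can be illustrated with a minimal sketch. This is not the authors' implementation: the workflow described above relies on Docker and a continuous integration service, neither of which appears here. The function names `fingerprint` and `rerun_if_changed` and the JSON log format are hypothetical, chosen only for illustration.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def fingerprint(paths):
    """Return a SHA-256 digest over the names and contents of the given files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def rerun_if_changed(inputs, command, log_path):
    """Rerun `command` when the inputs differ from the last logged run.

    Returns True if the analysis was rerun, False if the inputs were unchanged.
    Hypothetical helper: a real continuous integration service would trigger
    on each commit and run the command inside a container instead.
    """
    log_path = Path(log_path)
    history = json.loads(log_path.read_text()) if log_path.exists() else []
    current = fingerprint(inputs)
    if history and history[-1]["fingerprint"] == current:
        return False  # nothing changed since the last recorded run
    # check=True raises on a non-zero exit code, so failed runs are not logged.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    history.append(
        {"fingerprint": current, "command": command, "stdout": result.stdout}
    )
    log_path.write_text(json.dumps(history, indent=2))
    return True
```

Calling `rerun_if_changed` with the analysis script and data files as `inputs` reruns the command only when one of them has changed, and each logged entry ties an output to the exact inputs that produced it. In this sketch the log plays the role of the audit trail; it records input digests and outputs, not the data themselves.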

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Machine Learning*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software*
  • User-Computer Interface*
  • Workflow*