Data Management in Computational Systems Biology: Exploring Standards, Tools, Databases, and Packaging Best Practices

Methods Mol Biol. 2019;2049:285-314. doi: 10.1007/978-1-4939-9736-7_17.

Abstract

Computational systems biology involves integrating heterogeneous datasets in order to generate models. These models can assist with understanding and predicting biological phenomena. Generating datasets and integrating them into models requires a wide range of scientific expertise. As a result, these datasets are often collected by one set of researchers and exchanged with other researchers who construct the models. For this process to run smoothly, the data and models must be FAIR: findable, accessible, interoperable, and reusable. For data and models to be FAIR, they must be structured in consistent and predictable ways and described sufficiently for other researchers to understand them. Furthermore, these data and models must be shared with other researchers, with appropriately controlled sharing permissions, before and after publication. In this chapter we explore the different data and model standards that assist with structuring, describing, and sharing. We also highlight the popular standards and sharing databases within computational systems biology.

Keywords: Data storage; Databases; FAIR; Metadata; Model storage; Reproducible research; Standards.

MeSH terms

  • Computational Biology
  • Data Management / methods*
  • Databases, Factual
  • Systems Biology / methods*