Databases, Repositories, and Other Data Resources in Structural Biology

Methods Mol Biol. 2017;1607:643-665. doi: 10.1007/978-1-4939-7000-1_27.

Abstract

Structural biology, like many other areas of modern science, produces an enormous amount of primary, derived, and "meta" data, placing high demands on data storage and manipulation. Primary data come from various steps of sample preparation, diffraction experiments, and functional studies. These data are used not only to obtain tangible results, such as macromolecular structural models, but also to enrich and guide our analysis and interpretation of various biomedical problems. Herein we define four categories of data resources that can accommodate primary, derived, or reference data: (a) Archives, (b) Repositories, (c) Databases, and (d) Advanced Information Systems. Data resources may be used either as web portals or internally by structural biology software. To be useful, each resource must be maintained, curated, and integrated with other resources. Ideally, the system of interconnected resources should evolve toward comprehensive "hubs", or Advanced Information Systems. Such systems, encompassing the PDB and UniProt, are indispensable not only for structural biology but also for many related fields of science. The categories of data resources described herein are applicable well beyond our usual scientific endeavors.

Keywords: Archive; Data resource; Database; Information system; Metadata; Repository; Structural biology.

Publication types

  • Review

MeSH terms

  • Computational Biology / methods*
  • Crystallography, X-Ray / methods
  • Databases, Protein / statistics & numerical data*
  • Information Storage and Retrieval / methods
  • Information Storage and Retrieval / statistics & numerical data*
  • Internet
  • Macromolecular Substances / chemistry
  • Macromolecular Substances / ultrastructure*
  • Microscopy, Electron / methods
  • Models, Molecular
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / ultrastructure*
  • Software
  • Stereoisomerism

Substances

  • Macromolecular Substances
  • Proteins