Structural genomics and the Protein Data Bank

J Biol Chem. 2021 Jan-Jun;296:100747. doi: 10.1016/j.jbc.2021.100747. Epub 2021 May 3.


The field of structural genomics arose over the last three decades to address a large and rapidly growing divergence between microbial genomic, functional, and structural data. Several international programs took advantage of the vast genomic sequence information and evaluated the feasibility of structure determination for expanded and newly discovered protein families. As a consequence, structural genomics developed structure-determination pipelines and applied them to a wide range of novel, uncharacterized proteins, often from "microbial dark matter," and later to proteins from human pathogens. Advances were especially needed in protein production and rapid de novo structure solution. The experimental three-dimensional models were promptly made public, facilitating structure determination of other members of each family and helping to elucidate their molecular and biochemical functions. Improvements in experimental methods and databases led to rapid progress in molecular and structural biology. The Protein Data Bank structure repository played a central role in coordinating structural genomics efforts and the structural biology community as a whole. It facilitated the development of standards and validation tools essential for maintaining the high quality of deposited structural data.

Keywords: Protein Data Bank; X-ray crystallography; databases; structural biology; structural genomics.

Publication types

  • Historical Article
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Animals
  • Computational Biology / history*
  • Databases, Protein
  • Genomics / history*
  • History, 20th Century
  • History, 21st Century
  • Humans
  • Models, Molecular*