A database of domain definitions for proteins with complex interdomain geometry

PLoS One. 2009;4(4):e5084. doi: 10.1371/journal.pone.0005084. Epub 2009 Apr 8.

Abstract

Protein structural domains are necessary for understanding evolution and protein folding, and may differ widely from functional and sequence-based domains. Although various structural domain databases exist, defining domains for some proteins is non-trivial, and definitions of their domain boundaries are not available. Here, we present a novel database of manually defined structural domains for a representative set of proteins from the SCOP "multi-domain proteins" class (http://prodata.swmed.edu/multidom/). We consider our domains as mobile evolutionary units, which may rearrange during protein evolution. Additionally, they may be visualized as structurally compact and possibly independently folding units. We also found that representing domains as evolutionary and folding units does not always lead to a unique domain definition. However, unlike existing databases, we retain and refine these "alternate" domain definitions after careful inspection of structural similarity, functional sites and automated domain definition methods. We provide domain definitions, including actual residue boundaries, for proteins that well-known databases like SCOP and CATH do not attempt to split. Our alternate domain definitions are suitable for sequence and structure searches by automated methods. Additionally, the database can be used for training and testing domain delineation algorithms. Since our domains represent structurally compact evolutionary units, the database may be useful for studying domain properties and evolution.
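As an illustration of how alternate domain definitions with explicit residue boundaries might be used to test a domain delineation algorithm, the following is a minimal sketch only. The data layout (a dictionary of domain labels mapped to inclusive residue ranges), the function names, and the per-residue overlap score are all assumptions made for this example; they are not the database's file format or the authors' evaluation method. The idea is that a prediction is scored against every alternate definition and credited with the best match.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

# Hypothetical layout: domain label -> list of inclusive (start, end) residue
# ranges. A discontinuous domain simply lists more than one range.
DomainDef = Dict[str, List[Tuple[int, int]]]


def residue_sets(definition: DomainDef) -> Dict[str, Set[int]]:
    """Expand each domain's residue ranges into a set of residue numbers."""
    return {
        name: {res for start, end in segments for res in range(start, end + 1)}
        for name, segments in definition.items()
    }


def overlap_score(predicted: DomainDef, reference: DomainDef) -> float:
    """Fraction of reference residues matched under the best one-to-one
    pairing of predicted and reference domains (brute force, which is
    adequate for the handful of domains found in a single chain)."""
    pred = residue_sets(predicted)
    ref = residue_sets(reference)
    ref_names = list(ref)
    total = sum(len(ref[name]) for name in ref_names)
    if total == 0:
        return 0.0
    # Pad with None so a reference domain may be left unmatched.
    pred_names = list(pred) + [None] * len(ref_names)
    best = 0
    for assignment in permutations(pred_names, len(ref_names)):
        shared = sum(
            len(pred[p] & ref[r])
            for p, r in zip(assignment, ref_names)
            if p is not None
        )
        best = max(best, shared)
    return best / total


def best_alternate_score(predicted: DomainDef, alternates: List[DomainDef]) -> float:
    """Score a prediction against each alternate definition; keep the best."""
    return max(overlap_score(predicted, alt) for alt in alternates)


# Invented two-domain example with two alternate boundary placements.
alternate_a = {"D1": [(1, 100)], "D2": [(101, 200)]}
alternate_b = {"D1": [(1, 60)], "D2": [(61, 200)]}
prediction = {"P1": [(1, 65)], "P2": [(66, 200)]}

score = best_alternate_score(prediction, [alternate_a, alternate_b])
```

The one-to-one pairing is a deliberate choice in this sketch: a prediction that leaves the chain unsplit cannot be credited with matching every reference domain at once. Other scoring schemes, such as tolerance windows around boundary positions, would be equally reasonable.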

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Automation
  • DNA-Directed DNA Polymerase / chemistry
  • DNA-Directed DNA Polymerase / metabolism
  • DNA-Directed RNA Polymerases / chemistry
  • DNA-Directed RNA Polymerases / metabolism
  • Databases, Protein*
  • Evolution, Molecular
  • Models, Molecular
  • Protein Folding
  • Proteins / chemistry*

Substances

  • Proteins
  • DNA-Directed RNA Polymerases
  • DNA-Directed DNA Polymerase