Alignment and searching for common protein folds using a data bank of structural templates

J Mol Biol. 1993 Jun 5;231(3):735-52. doi: 10.1006/jmbi.1993.1323.

Abstract

We introduce an approach to protein comparison that exploits tertiary-structure information when aligning a protein sequence of known tertiary structure, or an aligned set of sequences of known homologous structures, with one or more other sequences. The local tertiary environments of residues in the three-dimensional structure or structures (defined in terms of residue accessibility to solvent, secondary structure and hydrogen bonding) are used to select position-specific amino acid substitution scores and to produce a scoring template suitable for aligning sequences or searching sequence data banks. The amino acid substitution scores were accumulated from 72 families of protein structures, in which the observed substitutions were classified according to features of the local structure. Hence, the value attributed to a particular amino acid interchange in the template is not constant but depends on the environmental context in which that substitution occurs. We have used these structural templates to align proteins and to search an amino acid sequence data bank for proteins with a similar fold. We have also produced a database of templates corresponding to both unique structures and aligned homologous structures from the Brookhaven Protein Data Bank. A new sequence can be searched against this database of templates to identify a similar tertiary fold, even if the sequence is not significantly similar to that of any protein of known three-dimensional structure.
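
The idea of an environment-dependent scoring template can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two environment labels ("buried", "exposed"), every score in `TOY_SCORES`, the `MISSING` default, the linear gap penalty, and the function names `build_template` and `align_score` are all assumptions made for illustration. The abstract defines environments by solvent accessibility, secondary structure and hydrogen bonding, and derives its scores from 72 structural families; none of those values are reproduced here. The alignment step uses a plain global dynamic-programming recurrence as a stand-in, since the abstract does not specify the alignment algorithm.

```python
from typing import Dict, List, Tuple

# Hypothetical table: environment -> (template residue, query residue) -> score.
# All numbers are invented toy values, not scores from the paper.
SubTable = Dict[str, Dict[Tuple[str, str], int]]

TOY_SCORES: SubTable = {
    "buried": {("L", "L"): 5, ("L", "I"): 3, ("L", "D"): -5,
               ("G", "G"): 6, ("G", "D"): -3},
    "exposed": {("L", "L"): 2, ("L", "I"): 1, ("L", "D"): 0,
                ("G", "G"): 4, ("G", "D"): 1},
}
MISSING = -1  # assumed score for pairs absent from the toy table


def build_template(residues: str, envs: List[str], table: SubTable,
                   alphabet: str) -> List[Dict[str, int]]:
    """Return one score row per structural position.

    Each row maps every query residue in `alphabet` to the score chosen
    by that position's residue AND its local environment.
    """
    if len(residues) != len(envs):
        raise ValueError("one environment label is needed per residue")
    return [
        {aa: table[env].get((res, aa), MISSING) for aa in alphabet}
        for res, env in zip(residues, envs)
    ]


def align_score(template: List[Dict[str, int]], query: str,
                gap: int = -2) -> int:
    """Best global alignment score of `query` against the template.

    Uses a linear gap penalty (an assumption). Every residue in `query`
    must be in the alphabet used to build the template.
    """
    m = len(query)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(template) + 1):
        cur = [i * gap] + [0] * m
        for j in range(1, m + 1):
            cur[j] = max(
                prev[j - 1] + template[i - 1][query[j - 1]],  # match position i with residue j
                prev[j] + gap,                                # gap in the query
                cur[j - 1] + gap,                             # gap in the template
            )
        prev = cur
    return prev[m]


if __name__ == "__main__":
    template = build_template("LG", ["buried", "exposed"], TOY_SCORES, "LIDG")
    for candidate in ("LG", "IG", "DG"):
        print(candidate, align_score(template, candidate))
```

In this toy table the same interchange (L to D) is penalized heavily at a buried position and is neutral at an exposed one, which is the sense in which the score for a substitution is not a constant. Searching a sequence data bank would amount to calling `align_score` for each sequence and ranking the results; searching a template database would do the converse, scoring one new sequence against each stored template.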

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Animals
  • Databases, Factual*
  • Humans
  • Hydrogen Bonding
  • Information Storage and Retrieval*
  • Molecular Sequence Data
  • Protein Folding*
  • Protein Structure, Tertiary*
  • Sequence Alignment* / methods
  • Sequence Homology, Amino Acid
  • Templates, Genetic
  • X-Ray Diffraction

Substances

  • Amino Acids

Associated data

  • PDB/UNKNOWN