Extracting hydrogen-bond signature patterns from protein structure data

Appl Bioinformatics. 2004;3(2-3):125-35. doi: 10.2165/00822942-200403020-00007.


Classification of protein sequences and structures into families is a fundamental task in biology, and it is often used as a basis for designing experiments to gain further knowledge. Some relationships between proteins are detected through similarities in their sequences, and many more through similarities in their structures. Despite this, there are a number of examples of functionally similar molecules without any recognisable sequence or structure similarity, and there are also a number of protein molecules that share a common structural scaffold but exhibit different functions. Newer methods of comparing molecules are therefore required in order to detect similarities and dissimilarities in protein molecules. In this article, it is proposed that the precise 3-dimensional disposition of key residues in a protein molecule is what matters for its function, or what conveys the "meaning" for a biological system, rather than the means by which that disposition is achieved. The concept of comparing two molecules through their intramolecular interaction networks is explored, since these networks dictate the disposition of amino acids in a protein structure. First, signature patterns, or fingerprints, of interaction networks in pre-classified protein structural families are computed using an approach that finds structural equivalences and consensus hydrogen bonds. Five examples from different structural classes are illustrated. These patterns are then used to search the entire Protein Data Bank, an approach through which new, unexpected similarities have been found. The potential for finding relationships through this approach is highlighted. The use of hydrogen-bond fingerprints as a new metric for measuring similarities in protein structures is also described.
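The idea of a consensus hydrogen-bond signature can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the algorithm, so the representation of a hydrogen bond as a (donor, acceptor) pair of structurally equivalent positions, the `min_fraction` threshold, and the function names `consensus_hbonds` and `signature_match` are all assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Set, Tuple

# Assumption: each hydrogen bond is a (donor, acceptor) pair of positions
# already mapped onto a common structural alignment of the family.
HBond = Tuple[int, int]


def consensus_hbonds(family: List[Set[HBond]], min_fraction: float = 0.8) -> Set[HBond]:
    """Return bonds present in at least min_fraction of the family members."""
    if not family:
        return set()
    # Each member is a set, so a bond is counted at most once per member.
    counts = Counter(bond for member in family for bond in member)
    threshold = min_fraction * len(family)
    return {bond for bond, n in counts.items() if n >= threshold}


def signature_match(signature: Set[HBond], candidate: Set[HBond]) -> float:
    """Fraction of signature bonds that also occur in the candidate structure."""
    if not signature:
        return 0.0
    return len(signature & candidate) / len(signature)


# Hypothetical toy family of three aligned structures.
family = [
    {(1, 5), (2, 6), (10, 14)},
    {(1, 5), (2, 6), (11, 15)},
    {(1, 5), (2, 6), (10, 14)},
]
signature = consensus_hbonds(family, min_fraction=0.8)
score = signature_match(signature, {(1, 5), (9, 13)})
```

In this sketch, a search over a structure database would amount to computing `signature_match` for every candidate whose hydrogen bonds have been mapped onto the family's equivalent positions; how that mapping is obtained is outside the scope of the example.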

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Binding Sites
  • Conserved Sequence
  • Hydrogen / chemistry*
  • Hydrogen Bonding
  • Molecular Sequence Data
  • Peptide Mapping / methods*
  • Protein Binding
  • Proteins / chemistry*
  • Proteins / classification
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Structure-Activity Relationship


Substances

  • Amino Acids
  • Proteins
  • Hydrogen