Parser for Protein Folding Units

Proteins. 1994 Jul;19(3):256-68. doi: 10.1002/prot.340190309.


General patterns of protein structural organization have emerged from studies of hundreds of structures elucidated by X-ray crystallography and nuclear magnetic resonance. Structural units are commonly identified by visual inspection of molecular models using qualitative criteria. Here, we propose an algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions. The underlying physical concept is maximal interactions within each unit and minimal interaction between units (domains). In a simple harmonic approximation, interdomain dynamics is determined by the strength of the interface and the distribution of masses. The most likely domain decomposition involves units with the most correlated motion, or largest interdomain fluctuation time. The decomposition of a convoluted 3-D structure is complicated by the possibility that the chain can cross over several times between units. Grouping the residues by solving an eigenvalue problem for the contact matrix reduces the problem to a one-dimensional search for all reasonable trial bisections. Recursive bisection yields a tree of putative folding units. Simple physical criteria are used to identify units that could exist by themselves. The units so defined closely correspond to crystallographers' notion of structural domains. The results are useful for the analysis of folding principles, for modular protein design and for protein engineering.
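The grouping step described above (an eigenvalue problem on the contact matrix, followed by a one-dimensional search over trial bisections) can be illustrated with a minimal sketch. The abstract does not give the exact matrix or scoring function, so everything specific here is an assumption made for illustration: the second eigenvector of the graph Laplacian of a binary contact matrix stands in for the eigenvalue problem, interface contacts divided by the product of unit sizes stands in for the physical criterion, and the 12-residue toy contact map and all function names are hypothetical. The toy chain crosses between the two units three times, so no single cut along the sequence separates them.

```python
import random


def laplacian(contacts):
    """Graph Laplacian L = D - C of a symmetric contact matrix C (zero diagonal)."""
    n = len(contacts)
    return [[(sum(contacts[i]) if i == j else 0.0) - contacts[i][j]
             for j in range(n)] for i in range(n)]


def fiedler_vector(contacts, iterations=500, seed=0):
    """Approximate the Laplacian eigenvector with the smallest nonzero eigenvalue.

    Uses power iteration on (shift*I - L) while projecting out the constant
    vector. This is an assumed stand-in for the paper's eigenvalue problem.
    """
    n = len(contacts)
    lap = laplacian(contacts)
    # Any shift above the largest eigenvalue of L works; 2 * max degree is a bound.
    shift = 2.0 * max(lap[i][i] for i in range(n)) + 1.0
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # stay orthogonal to the trivial eigenvector
        w = [shift * v[i] - sum(lap[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def best_bisection(contacts):
    """Order residues along the eigenvector, then scan every cut position."""
    n = len(contacts)
    v = fiedler_vector(contacts)
    order = sorted(range(n), key=lambda i: v[i])
    best = None
    for k in range(1, n):
        left, right = order[:k], order[k:]
        interface = sum(contacts[i][j] for i in left for j in right)
        # Assumed score: interface contacts normalized by the unit sizes.
        score = interface / (len(left) * len(right))
        if best is None or score < best[0]:
            best = (score, sorted(left), sorted(right))
    return best


# Hypothetical toy protein: two compact units that are not contiguous in sequence.
domain_a = {0, 1, 2, 3, 8, 9}
domain_b = {4, 5, 6, 7, 10, 11}


def toy_contacts(n=12):
    """Contact = same unit (dense packing) or sequence neighbors (chain bond)."""
    contacts = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_unit = (i in domain_a) == (j in domain_a)
            if i != j and (same_unit or abs(i - j) == 1):
                contacts[i][j] = 1.0
    return contacts


score, unit_1, unit_2 = best_bisection(toy_contacts())
print(unit_1, unit_2, round(score, 3))
```

Applying `best_bisection` again to the contact sub-matrix of each resulting unit would give the recursive tree of putative folding units; a stopping rule (for example a minimum unit size or a threshold on the score) would play the role of the physical criteria mentioned in the abstract, and is likewise left as an assumption here.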

Publication types

  • Comparative Study

MeSH terms

  • Actins / chemistry
  • Algorithms
  • Binding Sites
  • Computer Simulation*
  • Flavin-Adenine Dinucleotide / chemistry
  • Mathematical Computing
  • Models, Chemical*
  • Protein Denaturation
  • Protein Structure, Tertiary*
  • Serine Endopeptidases / chemistry


Substances

  • Actins
  • Flavin-Adenine Dinucleotide
  • Serine Endopeptidases