Coarse-graining protein energetics in sequence variables

Phys Rev Lett. 2005 Sep 30;95(14):148103. doi: 10.1103/PhysRevLett.95.148103. Epub 2005 Sep 29.


We show that cluster expansions (CE), previously used to model solid-state materials with binary or ternary configurational disorder, can be extended to the protein design problem. We present a generalized CE framework, in which properties such as energy can be unambiguously expanded in the amino-acid sequence space. The CE coarse-grains over nonsequence degrees of freedom (e.g., side-chain conformations) and thereby simplifies the problem of designing proteins, or predicting the compatibility of a sequence with a given structure, by many orders of magnitude. The CE is physically transparent, and can be evaluated through linear regression on the energies of training sequences. We show, as an example, that good prediction accuracy is obtained with up to pairwise interactions for a coiled-coil backbone, and that triplet interactions are important in the energetics of a more globular zinc-finger backbone.
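The regression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the indicator basis with a per-site reference letter, the two-letter alphabet, the toy energy function, and the names `ce_features`, `fit_ce`, `predict_ce`, and `toy_energy` are all assumptions made for illustration. The sketch truncates the expansion at pairwise terms and fits the coefficients by ordinary least squares on training-sequence energies.

```python
import itertools
import numpy as np

def ce_features(seq, alphabet, pairs):
    """Encode a sequence as constant, point, and pair indicator features.

    Assumption: the last letter of the alphabet is the reference at every
    site, so its indicators are dropped to keep the basis non-redundant.
    """
    letters = alphabet[:-1]
    feats = [1.0]  # constant (empty-cluster) term
    for i in range(len(seq)):  # point clusters
        feats.extend(float(seq[i] == a) for a in letters)
    for i, j in pairs:  # pair clusters
        feats.extend(float(seq[i] == a and seq[j] == b)
                     for a in letters for b in letters)
    return np.array(feats)

def fit_ce(seqs, energies, alphabet, pairs):
    """Least-squares fit of expansion coefficients to training energies."""
    X = np.array([ce_features(s, alphabet, pairs) for s in seqs])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(energies, dtype=float), rcond=None)
    return coeffs

def predict_ce(seq, coeffs, alphabet, pairs):
    """Evaluate the fitted expansion for one sequence."""
    return float(ce_features(seq, alphabet, pairs) @ coeffs)

def toy_energy(seq):
    """Hypothetical energy with site and nearest-neighbour pair terms only."""
    site = {"H": -1.0, "P": 0.5}
    e = sum(site[a] for a in seq)
    e += -0.8 * (seq[0] == seq[1]) + 0.3 * (seq[1] == seq[2])
    return e

# Toy setup (assumed): 3 sites, alphabet "HP", pair clusters on adjacent sites.
alphabet = "HP"
pairs = [(0, 1), (1, 2)]
all_seqs = ["".join(p) for p in itertools.product(alphabet, repeat=3)]
train, held_out = all_seqs[:-1], all_seqs[-1]

coeffs = fit_ce(train, [toy_energy(s) for s in train], alphabet, pairs)
predicted = predict_ce(held_out, coeffs, alphabet, pairs)
```

Because the toy energy contains nothing beyond pairwise terms, the truncated expansion should reproduce the held-out energy to within floating-point error. For a real energy function, where each training energy would come from optimizing over side-chain conformations, the held-out error indicates whether higher-order clusters such as triplets are needed.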

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Biophysical Phenomena
  • Biophysics
  • Cluster Analysis
  • Computational Biology
  • DNA / chemistry
  • Linear Models
  • Models, Molecular
  • Models, Statistical
  • Protein Conformation
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • RNA / chemistry
  • Thermodynamics
  • Zinc Fingers


Substances

  • Proteins
  • RNA
  • DNA