Highly expressed and slowly evolving proteins share compositional properties with thermophilic proteins

Mol Biol Evol. 2010 Mar;27(3):735-41. doi: 10.1093/molbev/msp270. Epub 2009 Nov 12.


The sequences of proteins encoded by a genome evolve at different rates. A correlate of a protein's evolutionary rate is its expression level: highly expressed proteins tend to evolve slowly. Some explanations of rate variation and the correlation between rate and expression predict that more slowly evolving and more highly expressed proteins have more favorable equilibrium constants for folding. Proteins from thermophiles generally have more stable folds than proteins from mesophiles, and it is known that there are systematic differences in amino acid content between thermophilic and mesophilic proteins. I examined whether there are analogous correlations of amino acid frequencies with evolutionary rate and expression level within genomes. In most of the organisms analyzed, there is a striking tendency for more slowly evolving proteins to be more thermophile-like in their amino acid compositions when adjustments are made for variation in GC content. More highly expressed proteins also tend to be more thermophile-like by the same criteria. These results suggest that part of the evolutionary rate variation among proteins is due to variation in the strength of selection for stability of the folded state. They also suggest that increasing strength of this selective force with expression level plays a role in the correlation between evolutionary rate and expression level.
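The kind of test described above, correlating a compositional score with evolutionary rate or expression level after adjusting for GC content, can be sketched in a few lines. This is only an illustration under stated assumptions and not the procedure used in the paper: the residue set passed to `fraction_in_set`, the choice of a simple linear fit to remove the GC effect, and all function names are hypothetical.

```python
from statistics import mean


def fraction_in_set(seq, residues):
    """Fraction of positions in `seq` occupied by a residue in `residues`.

    `residues` is a caller-supplied set (hypothetical here); the abstract
    does not say which amino acids define "thermophile-like".
    """
    seq = seq.upper()
    return sum(1 for aa in seq if aa in residues) / len(seq) if seq else 0.0


def ols_residuals(y, x):
    """Residuals of a least-squares line y = a + b*x (assumes x is not constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def gc_adjusted_correlation(scores, gc, covariate):
    """Rank-correlate `covariate` with the part of `scores` not explained by GC.

    Assumption: a linear fit is an adequate GC adjustment for this sketch.
    """
    return spearman(ols_residuals(scores, gc), covariate)
```

In use, `scores` would hold one compositional value per protein, `gc` the GC content of the corresponding genes, and `covariate` either evolutionary rate or expression level. Correlating ranks rather than raw values keeps the result insensitive to the heavily skewed distributions typical of expression data.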

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Amino Acids / chemistry
  • Amino Acids / genetics
  • Animals
  • Archaeal Proteins / chemistry
  • Archaeal Proteins / genetics
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Base Composition
  • Evolution, Molecular*
  • Fungal Proteins / chemistry
  • Fungal Proteins / genetics
  • Hot Temperature
  • Humans
  • Normal Distribution
  • Proteins / chemistry
  • Proteins / genetics*
  • Regression Analysis
  • Statistics, Nonparametric


Substances

  • Amino Acids
  • Archaeal Proteins
  • Bacterial Proteins
  • Fungal Proteins
  • Proteins