Genomic determinants of protein folding thermodynamics in prokaryotic organisms

J Mol Biol. 2004 Nov 5;343(5):1451-66. doi: 10.1016/j.jmb.2004.08.086.


Here we investigate how thermodynamic properties of orthologous proteins are influenced by the genomic environment in which they evolve. We performed a comparative computational study of 21 protein families in 73 prokaryotic species and obtained the following main results. (i) Stability with respect to the unfolded state and stability with respect to misfolding are anticorrelated. There appears to be a trade-off between these two properties, which cannot be optimized simultaneously. (ii) Folding thermodynamic parameters are strongly correlated with two genomic features, genome size and G+C composition. In particular, the normalized energy gap, an indicator of folding efficiency in statistical mechanical models of protein folding, is smaller in proteins of organisms with a small genome size and a compositional bias towards A+T. Such genomic features are characteristic of bacteria with an intracellular lifestyle. We interpret these correlations in light of mutation pressure and natural selection. A mutational bias toward A+T at the DNA level translates into a mutational bias toward more hydrophobic (and in general more interactive) proteins, a consequence of the structure of the genetic code. Increased hydrophobicity renders proteins more stable against unfolding but less stable against misfolding. Proteins with high hydrophobicity and low stability against misfolding occur in organisms with reduced genomes, such as obligate intracellular bacteria. We argue that these proteins become fixed because such organisms experience weaker purifying selection due to their small effective population sizes. This interpretation is supported by the observation of a high expression level of chaperones in these bacteria. Our results indicate that the mutational spectrum of a genome and the strength of selection significantly influence protein folding thermodynamics.
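The link between A+T bias and hydrophobicity stated above can be illustrated with a small sketch. It is not the method used in the study: it assumes the standard genetic code and uses the Kyte-Doolittle hydropathy scale as a stand-in for the "interactivity" of residues. The function names (gc_content, mean_hydropathy_by_gc_count) are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy values (assumption: used here as a proxy for
# hydrophobicity; the study itself relies on a folding energy model).
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def mean_hydropathy_by_gc_count():
    """Mean hydropathy of the encoded residue, grouping sense codons
    by how many of their three bases are G or C (0 to 3)."""
    groups = {n: [] for n in range(4)}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons
            continue
        n_gc = sum(base in "GC" for base in codon)
        groups[n_gc].append(KD[aa])
    return {n: sum(values) / len(values) for n, values in groups.items()}


if __name__ == "__main__":
    for n_gc, mean_kd in mean_hydropathy_by_gc_count().items():
        print(f"codons with {n_gc} G/C bases: mean hydropathy {mean_kd:+.2f}")
```

Under these assumptions, codons composed only of A and T encode, on average, more hydrophobic residues than codons composed only of G and C, which is the direction of the bias described in the abstract. Each codon is weighted equally here, so the sketch ignores codon usage and amino acid frequencies in real genomes.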

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / genetics*
  • Archaea / metabolism
  • Archaeal Proteins / genetics
  • Archaeal Proteins / metabolism
  • Bacteria / genetics*
  • Bacteria / metabolism
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Computational Biology
  • Genome, Archaeal
  • Genome, Bacterial
  • Hydrophobic and Hydrophilic Interactions
  • Models, Molecular
  • Principal Component Analysis
  • Protein Denaturation
  • Protein Folding*
  • Sequence Homology, Amino Acid
  • Thermodynamics*

Substances

  • Archaeal Proteins
  • Bacterial Proteins