Correlated evolutionary pressure at interacting transcription factors and DNA response elements can guide the rational engineering of DNA binding specificity

J Mol Biol. 2005 Jul 15;350(3):402-15. doi: 10.1016/j.jmb.2005.04.054.


Understanding the molecular mechanisms of the specific interaction between transcription factor proteins and DNA is key to comprehending the regulation of gene expression and to developing technologies to engineer transcription factors. Thus far, although there have been several attempts to elucidate protein-DNA interaction through amino acid-base recognition codes, sequence-based profiles, or physical models of interaction, the greatest successes in engineering DNA binding specificity remain experimental. Here we present the first systematic evidence of correlated evolutionary pressure at interacting amino acid residues and DNA base-pairs in transcription factors, and show that it can be used to rationally engineer DNA binding specificity. The correlation is between the relative evolutionary importance of protein residues and DNA bases, measured, respectively, in terms of the Evolutionary Trace (ET) rank and information entropy. The evolutionarily most important residues interact with the most conserved base-pairs within the response element, while residues of least importance interact with the most variable base-pairs. The correlation averages 0.74 over 12 unrelated families of transcriptional regulators, including the nuclear hormone receptor, basic helix-loop-helix, ETS, and homeodomain families. To test the predictive power of this correlation, we targeted a mutational swap of top-ranked ET residues in a transcription factor, LRH-1. This swap redirected LRH-1 binding as predicted and showed that, in this case, evolutionary importance and binding specificity are coupled sufficiently strongly for the Evolutionary Trace to guide the computational design of DNA binding specificity. This establishes the existence of evolutionary importance correlation at protein-DNA interfaces, and demonstrates that it is a useful principle for the rational engineering of binding specificity.
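The correlation described above pairs, for each protein-DNA contact, the ET rank of the residue with the information entropy of the base-pair position it touches. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes that a lower ET rank means a more important residue, that entropy is the Shannon entropy of each column in an alignment of binding sites, and that the coefficient is Pearson's r (the abstract does not say which coefficient was used). The sequences, ranks, contact list, and all function names are hypothetical illustration values and do not come from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column of bases."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def site_entropies(sites):
    """Per-position entropy for a list of equal-length binding-site sequences."""
    return [column_entropy(col) for col in zip(*sites)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical aligned response-element sequences (toy data, not from the paper).
sites = ["AGGTCA", "AGGTCA", "AGGTTA", "AGGACA"]
entropies = site_entropies(sites)

# Hypothetical contacts: (ET percentile rank of residue, index of contacted base).
# Assumption: a lower rank means a more evolutionarily important residue.
contacts = [(0.05, 1), (0.10, 2), (0.60, 3), (0.45, 4)]

ranks = [rank for rank, _ in contacts]
contact_entropies = [entropies[pos] for _, pos in contacts]

# A positive r means important residues contact conserved (low-entropy) bases.
r = pearson(ranks, contact_entropies)
print(f"entropy per position: {[round(h, 3) for h in entropies]}")
print(f"ET rank vs. entropy correlation: {r:.2f}")
```

In this toy setup, a positive coefficient corresponds to the pattern reported in the abstract: low-ranked (important) residues pair with low-entropy (conserved) base-pairs, and high-ranked residues pair with variable ones.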

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Biological Evolution*
  • Computational Biology
  • DNA / chemistry
  • DNA / genetics*
  • DNA / metabolism
  • DNA Mutational Analysis
  • DNA-Binding Proteins / chemistry
  • Entropy
  • Evolution, Molecular
  • Genomics / methods
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Mutation
  • Nucleic Acid Conformation
  • Phylogeny
  • Protein Binding
  • Protein Engineering / methods*
  • Receptors, Cytoplasmic and Nuclear / chemistry
  • Receptors, Estrogen / metabolism
  • Response Elements
  • Software
  • Thermodynamics
  • Transcription Factors / chemistry
  • Transcription Factors / metabolism*

Substances

  • DNA-Binding Proteins
  • NR5A2 protein, human
  • Receptors, Cytoplasmic and Nuclear
  • Receptors, Estrogen
  • Transcription Factors
  • DNA