The effects of selection against spurious transcription factor binding sites

Mol Biol Evol. 2003 Jun;20(6):901-6. doi: 10.1093/molbev/msg096. Epub 2003 Apr 25.


Most genomes contain nucleotide sequences with no known function; such sequences are assumed to be free of constraints, evolving only according to the vagaries of mutation. Here we show that selection acts to remove spurious transcription factor binding site motifs throughout 52 fully sequenced genomes of Eubacteria and Archaea. Examining the sequences necessary for polymerase binding, we find that spurious binding sites are underrepresented in both coding and noncoding regions. The average proportion of spurious binding sites found relative to the expected is 80% in eubacterial genomes and 89% in archaeal genomes. We also estimate the strength of selection against spurious binding sites in the face of the constant creation of new binding sites via mutation. Under conservative assumptions, we estimate that selection is weak, with the average efficacy of selection against spurious binding sites, N_e s, of -0.12 for eubacterial genomes and -0.06 for archaeal genomes, similar to that of codon bias. Our results suggest that both coding and noncoding sequences are constrained by selection to avoid specific regions of sequence space.
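
The observed-to-expected proportion reported above can be illustrated with a minimal sketch. The abstract does not state how the expected count was derived, so the null model here (independent nucleotides drawn at the sequence's own base frequencies) is an assumption, as are the function names and the example motif. It is not the authors' procedure.

```python
from collections import Counter


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_count(seq, motif):
    """Expected motif count under an ASSUMED null model:
    each position is drawn independently at the base
    frequencies observed in seq itself."""
    n = len(seq)
    freqs = Counter(seq)
    p = 1.0
    for base in motif:
        p *= freqs[base] / n  # Counter returns 0 for absent bases
    return (n - len(motif) + 1) * p


def obs_exp_ratio(seq, motif):
    """Observed / expected; values below 1 mean underrepresentation."""
    exp = expected_count(seq, motif)
    if exp == 0:
        return float("nan")
    return count_overlapping(seq, motif) / exp


if __name__ == "__main__":
    # Hypothetical toy input; a real analysis would read a full genome.
    genome = "ACGTTGACATTGCATATAATGCGT" * 50
    print(obs_exp_ratio(genome, "TTGACA"))
```

Under this kind of null model, a ratio near 0.8 would correspond to the 80% figure quoted for eubacterial genomes. A more careful analysis would likely use a null that preserves dinucleotide or codon composition, since base frequencies alone can over- or underestimate the expected count.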

MeSH terms

  • Archaea / genetics
  • Bacteria / genetics
  • Binding Sites
  • Promoter Regions, Genetic
  • Selection, Genetic*
  • Transcription Factors / metabolism*


Substances

  • Transcription Factors