The excess of small inverted repeats in prokaryotes

J Mol Evol. 2008 Sep;67(3):291-300. doi: 10.1007/s00239-008-9151-z. Epub 2008 Aug 12.

Abstract

Recent analyses have shown that there is a large excess of perfect inverted repeats in many prokaryotic genomes but not in eukaryotic ones. This difference could be due to a genuine difference between prokaryotes and eukaryotes or to differences in the methods and types of data analyzed (full genomes versus protein-coding sequences). We used simulations to show that the method used previously tends to underestimate the expected number of inverted repeats. However, this bias is not large and cannot explain the excess of inverted repeats observed in real data. In contrast, our method is unbiased. When both methods are applied to bacterial protein-coding sequences, they both detect an excess of inverted repeats, which is much lower than previously reported in whole prokaryotic genomes. This suggests that the reported large excess of inverted repeats is due to repeats found in intergenic regions. These repeats could be due to transcription factor binding sites, or other types of repetitive DNA, on opposite strands of the DNA sequence. In contrast, the smaller, but significant, excess of inverted repeats that we report in protein-coding sequences may be due to sequence-directed mutagenesis (SDM). SDM is a process where one copy of a small, imperfect, inverted repeat corrects the other copy via strand misalignment, resulting in a perfect repeat and a series of mutations. We show by simulation that even very low levels of SDM, relative to the rate of point mutation, can generate a substantial excess of inverted repeats.
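
To make the quantities in the abstract concrete, the following is a minimal Python sketch of how an excess of perfect inverted repeats might be estimated. It is not the method of this paper or of the earlier analyses. The arm length, spacer range, and single-nucleotide shuffling null model are assumptions chosen for illustration, and all function names (`count_inverted_repeats`, `expected_by_shuffling`, `sdm_correct`) are hypothetical. The last function only mimics the outcome of SDM as described above (one arm overwritten to match the other), not the underlying strand-misalignment mechanism.

```python
import random

# Translation table for complementing DNA bases (upper-case A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_inverted_repeats(seq, arm=5, min_spacer=0, max_spacer=10):
    """Count perfect inverted repeats: a left arm of length `arm` followed,
    after a spacer of min_spacer..max_spacer bases, by its reverse complement.
    Arm length and spacer range are illustrative assumptions."""
    n = len(seq)
    count = 0
    for i in range(n - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for spacer in range(min_spacer, max_spacer + 1):
            j = i + arm + spacer
            if j + arm > n:
                break  # right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                count += 1
    return count


def expected_by_shuffling(seq, n_shuffles=100, seed=0, **kwargs):
    """Estimate the expected count under a simple null model (assumption):
    shuffle single nucleotides, which preserves base composition only."""
    rng = random.Random(seed)
    letters = list(seq)
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        total += count_inverted_repeats("".join(letters), **kwargs)
    return total / n_shuffles


def sdm_correct(seq, i, arm, spacer):
    """Toy model of the SDM outcome: overwrite the right arm with the
    reverse complement of the left arm starting at position i."""
    j = i + arm + spacer
    return seq[:j] + reverse_complement(seq[i:i + arm]) + seq[j + arm:]


if __name__ == "__main__":
    rng = random.Random(42)
    genome = "".join(rng.choice("ACGT") for _ in range(2000))  # synthetic data
    observed = count_inverted_repeats(genome)
    expected = expected_by_shuffling(genome, n_shuffles=20)
    print("observed:", observed, "expected under shuffling:", expected)
```

A single-nucleotide shuffle destroys codon structure, so for protein-coding sequences a null model that preserves more of the sequence (for example, codon usage) would be needed. The choice of null model is exactly the kind of methodological difference the abstract says can bias the expected count.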

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Genome / genetics
  • Mutagenesis, Site-Directed
  • Point Mutation / genetics
  • Prokaryotic Cells / metabolism*
  • Repetitive Sequences, Nucleic Acid / genetics*