The majority of recent short DNA insertions in the human genome are tandem duplications

Mol Biol Evol. 2007 May;24(5):1190-7. doi: 10.1093/molbev/msm035. Epub 2007 Feb 24.

Abstract

Nucleotide substitutions, insertions, and deletions constitute the principal molecular mechanisms generating genetic variation on small length scales. In contrast to substitutions, the nature of short DNA insertions and deletions (indels) is far less well understood. With the recent availability of whole-genome multiple alignments between human and other primates, detailed investigations into the characteristics and origin of indels have come within reach. Here, we show that the majority of short (1-100 bp) DNA insertions in the human lineage are tandem duplications of directly adjacent sequence segments with conserved polarity. Indels in microsatellites account for only a small fraction. The underlying molecular processes generating indels do not necessarily rely on the presence of preexisting duplicates, as would be expected for unequal crossing over or replication slippage. Instead, our findings point toward a mechanism that preferentially occurs in the male germline and is not recombination-mediated. Surprisingly, nonframeshifting tandem duplications and deletions in coding regions still occur at approximately 50% of their genomic background rates. As is already well established for gene and segmental duplications, our results demonstrate that duplications are also likely to constitute the predominant process for rapid generation of new genetic material and function on smaller scales.
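
To make the notion of a "tandem duplication of directly adjacent sequence with conserved polarity" concrete, the following Python sketch labels a single insertion given its flanking sequence. It is an illustration only and not the authors' pipeline: the function name `classify_insertion`, the three output labels, and the exact-match criterion are all assumptions introduced here. Because the position of an insertion inside a repeat is ambiguous in an alignment, the sketch tries every placement of the duplicated unit that overlaps the insertion point. It does not separate microsatellite indels from other tandem duplications, which the abstract treats as distinct.

```python
# Hypothetical helper names; exact string matching is an assumed simplification.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_insertion(left_flank, inserted, right_flank):
    """Label an insertion relative to the sequence directly next to it.

    Returns "tandem_duplication" if the derived sequence contains two
    identical, directly adjacent copies of length len(inserted) that
    overlap the insertion point (same orientation), "inverted_duplication"
    if the insert is the reverse complement of the directly adjacent
    segment, and "other" otherwise.
    """
    left, ins, right = (s.upper() for s in (left_flank, inserted, right_flank))
    n = len(ins)
    if n == 0:
        raise ValueError("inserted sequence must be non-empty")

    derived = left + ins + right
    p = len(left)  # index where the insertion starts in the derived sequence

    # Slide the candidate unit across every placement touching the insert.
    for start in range(max(0, p - n), p + 1):
        first = derived[start:start + n]
        second = derived[start + n:start + 2 * n]
        if len(second) == n and first == second:
            return "tandem_duplication"

    # Opposite polarity: the insert matches the neighbouring segment on the other strand.
    rc = reverse_complement(ins)
    if left.endswith(rc) or right.startswith(rc):
        return "inverted_duplication"
    return "other"
```

For example, `classify_insertion("GA", "TCA", "TCG")` is labelled a tandem duplication even though the inserted bases themselves do not equal either flank, because the derived sequence `GATCATCG` contains the adjacent copies `ATC` `ATC`.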

MeSH terms

  • Animals
  • DNA*
  • Evolution, Molecular
  • Gene Duplication*
  • Genome, Human*
  • Humans
  • Macaca mulatta / genetics
  • Male
  • Microsatellite Repeats
  • Pan troglodytes / genetics
  • Sequence Alignment
  • Tandem Repeat Sequences / genetics*

Substances

  • DNA