Abundance, distribution, and mutation rates of homopolymeric nucleotide runs in the genome of Caenorhabditis elegans

J Mol Evol. 2004 May;58(5):584-95. doi: 10.1007/s00239-004-2580-4.


Homopolymeric nucleotide runs, also called mononucleotide microsatellites, are a ubiquitous, dominant, and mutagenic feature of eukaryotic genomes. A clear understanding of the forces that shape patterns of homopolymer evolution, however, is lacking. We provide a focused investigation of the abundance, chromosomal distribution, and mutation spectra of the four strand-specific homopolymer types (A, T, G, C) ≥8 bp in the genome of Caenorhabditis elegans. A and T homopolymers vastly outnumber G and C homopolymers, and the run-length distributions of A and T homopolymers differ significantly from those of G and C homopolymers. A scanning window analysis of homopolymer chromosomal distribution reveals distinct clusters of homopolymer density in autosome arms that are regions of high recombination in C. elegans. Dramatic biases are detected among closely spaced homopolymers; for instance, we observe 994 A homopolymers immediately followed by a T homopolymer (5' to 3') and only 8 instances of T homopolymers directly followed by an A homopolymer. Empirical homopolymer mutation assays in a set of C. elegans mutation-accumulation lines reveal an approximately 20-fold higher mutation rate for G and C homopolymers compared to A and T homopolymers. Nuclear A and T homopolymers are also found to mutate approximately 100-fold more slowly than mitochondrial A and T homopolymers. This integrative approach yields a total nuclear genome-wide homopolymer mutation rate estimate of approximately 1.6 mutations per genome per generation.
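
The strand-specific run detection and the adjacent-pair counts described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `find_homopolymers` and `count_adjacent_pairs` are hypothetical, and treating "immediately followed" as a gap of zero bases between two runs is an assumption made here for illustration. Only the 8 bp minimum run length comes from the abstract.

```python
import re
from collections import Counter

MIN_RUN = 8  # minimum run length in bp, as stated in the abstract


def find_homopolymers(seq, min_run=MIN_RUN):
    """Return (start, end, base) for every single-base run of at least min_run bp.

    Coordinates are 0-based and half-open, on the strand given in `seq`.
    """
    # One base followed by at least (min_run - 1) further copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_run - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


def count_adjacent_pairs(runs):
    """Count ordered (upstream base, downstream base) pairs of abutting runs.

    Assumption: two runs are 'adjacent' only if the second starts exactly
    where the first ends (no intervening bases).
    """
    pairs = Counter()
    for (_, end1, base1), (start2, _, base2) in zip(runs, runs[1:]):
        if end1 == start2:
            pairs[(base1, base2)] += 1
    return pairs


# Toy sequence, not real C. elegans data.
toy_seq = "CG" + "A" * 9 + "T" * 8 + "GC" + "G" * 8 + "ACGT"
toy_runs = find_homopolymers(toy_seq)
toy_pairs = count_adjacent_pairs(toy_runs)
```

On a real chromosome, comparing `toy_pairs[("A", "T")]` with `toy_pairs[("T", "A")]` is the kind of tally behind the 994 versus 8 contrast reported in the abstract, although the exact adjacency criterion used in the paper may differ from the one assumed here.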

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Caenorhabditis elegans / genetics*
  • DNA Mutational Analysis
  • Evolution, Molecular*
  • Genome*
  • Microsatellite Repeats / genetics*
  • Molecular Sequence Data
  • Mutation / genetics*
  • Recombination, Genetic / genetics

Associated data

  • GENBANK/AY219759
  • GENBANK/AY219760
  • GENBANK/AY219761
  • GENBANK/AY219762
  • GENBANK/AY219763
  • GENBANK/AY219764
  • GENBANK/AY219765
  • GENBANK/AY219766
  • GENBANK/AY219767
  • GENBANK/AY219768
  • GENBANK/AY219769
  • GENBANK/AY219770
  • GENBANK/AY219771
  • GENBANK/AY219772
  • GENBANK/AY219773
  • GENBANK/AY219774
  • GENBANK/AY219775
  • GENBANK/AY219776
  • GENBANK/AY219777
  • GENBANK/AY219778
  • GENBANK/AY219779
  • GENBANK/AY219780
  • GENBANK/AY219781
  • GENBANK/AY219782
  • GENBANK/AY219783
  • GENBANK/AY219784
  • GENBANK/AY219785
  • GENBANK/AY219786
  • GENBANK/AY219787
  • GENBANK/AY219788
  • GENBANK/AY219789