Measures of variation at DNA repeat loci under a general stepwise mutation model

Theor Popul Biol. 1996 Dec;50(3):345-67. doi: 10.1006/tpbi.1996.0035.


Polymorphisms at tandem repeat loci are caused by mutations with allele sizes occasionally altered by more than one repeat unit in both forward and backward directions. Such mutational changes may occur with asymmetric probabilities. Therefore, a one-step symmetric stepwise mutation model may not be appropriate for studying the population dynamics at all repeat loci. In this work, we evaluated the expectation and variance of the within-population variance of the allele size distribution in a finite population, and the expected homozygosity at a locus by the coalescence approach under a general stepwise mutation model, where mutational transitions of allele sizes can be arbitrary, including being asymmetric. Under the special cases of symmetric one-step, two-step, and multi-step geometric distributions of mutations, our general results reduce to the corresponding results obtained by earlier investigators. The general results indicate that in a finite population, which has reached a steady state under the (general stepwise) mutation and drift balance, the within-population variance of allele sizes has a simple expectation (i.e., proportional to Nν, the product of the mutation rate, ν, and effective population size, N). However, its stochastic variance is a quadratic function of this composite parameter, Nν. Furthermore, this second-order variance does not decay with the number of alleles sampled from a population. Application of this theory to data on allele size distributions in unrelated Caucasians from the CEPH pedigree (obtained from the Genome Data Base) shows that the relationship of the variance and mean of within-population variance of allele sizes at tandem repeat loci, grouped by their chromosomal assignment, has a trend compatible with the theory. However, there is an indication that the second-order variance is generally underestimated. One reason for this departure might be that the CEPH sample may not represent a single homogeneous population that reached equilibrium at all tandem repeat loci.
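
The proportionality between the expected within-population variance and the product of N and the mutation rate (nu) can be illustrated with a small coalescent simulation. The sketch below is not the authors' code. It assumes a standard neutral coalescent with time in units of 2N generations, a diploid scaling theta = 4 * N * nu, and an arbitrary (possibly asymmetric) step distribution supplied as a dictionary. The function names and the closed-form check (theta / 2 times the second moment of the step size, from a textbook pairwise-coalescence argument rather than quoted from the paper) are assumptions made for illustration.

```python
import random
import statistics


def poisson_count(rng, mean):
    """Number of unit-rate Poisson events falling in an interval of length `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_sample(n, theta, step_probs, rng):
    """Allele sizes (relative to the common ancestor) of n sampled genes.

    theta      -- assumed scaling 4 * N * nu (diploid population)
    step_probs -- hypothetical step distribution, e.g. {+1: 0.6, -1: 0.3, +2: 0.1}
    """
    steps = list(step_probs)
    weights = [step_probs[s] for s in steps]
    sizes = [0] * n
    lineages = [[i] for i in range(n)]  # each lineage lists the samples below it
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next coalescence among k lineages.
        t = rng.expovariate(k * (k - 1) / 2)
        for leaves in lineages:
            # Mutations on this branch segment shift every descendant sample.
            m = poisson_count(rng, theta / 2 * t)
            if m:
                shift = sum(rng.choices(steps, weights, k=m))
                for leaf in leaves:
                    sizes[leaf] += shift
        # Merge two lineages chosen uniformly at random.
        i, j = rng.sample(range(k), 2)
        merged = lineages[i] + lineages[j]
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return sizes


def expected_variance(theta, step_probs):
    """Assumed expectation: (theta / 2) * E[step^2], linear in N * nu."""
    return theta / 2 * sum(p * s * s for s, p in step_probs.items())


if __name__ == "__main__":
    rng = random.Random(1)
    steps = {+1: 0.6, -1: 0.3, +2: 0.1}  # asymmetric, multi-step example
    reps = [statistics.variance(simulate_sample(10, 2.0, steps, rng))
            for _ in range(2000)]
    print("simulated mean of V:", statistics.mean(reps))
    print("assumed expectation:", expected_variance(2.0, steps))
    print("spread of V across replicates:", statistics.variance(reps))
```

Rerunning with different values of theta gives a rough view of how the replicate-to-replicate spread of the variance grows faster than its mean, which is the qualitative behaviour the abstract describes.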

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • DNA / genetics*
  • European Continental Ancestry Group / genetics
  • Genetic Variation*
  • Genome, Human
  • Homozygote
  • Humans
  • Models, Genetic*
  • Mutation / genetics*
  • Pedigree
  • Polymorphism, Genetic / genetics*
  • Repetitive Sequences, Nucleic Acid / genetics*


Substances

  • DNA