Ewens' sampling formula and related formulae: combinatorial proofs, extensions to variable population size and applications to ages of alleles

Theor Popul Biol. 2005 Nov;68(3):167-77. doi: 10.1016/j.tpb.2005.02.004.

Abstract

Ewens' sampling formula, the probability distribution of a configuration of alleles in a sample of genes under the infinitely-many-alleles model of mutation, is proved by a direct combinatorial argument. The distribution is extended to a model where the population size may vary back in time. The distribution of age-ordered frequencies in the population is also derived in this model, extending the GEM distribution of age-ordered frequencies in a model with a constant-sized population. The genealogy of a rare allele is studied using a combinatorial approach. A connection is explored between the distribution of age-ordered frequencies and ladder indices and heights in a sequence of random variables. In a sample of n genes the connection is with ladder heights and indices in a sequence of draws from an urn containing balls labelled 1, 2, ..., n; in the population the connection is with ladder heights and indices in a sequence of independent uniform random variables.
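
For orientation, the abstract does not state the formula itself. The sketch below evaluates the standard constant-population form of Ewens' sampling formula, P(a_1, ..., a_n) = n! / (theta (theta+1) ... (theta+n-1)) * prod_j theta^a_j / (j^a_j a_j!), where a_j is the number of allele types represented j times in a sample of n genes. The symbols theta and a_j, and all function names, are assumptions taken from common usage rather than from the text above, and the variable-population-size extension described in the abstract is not covered.

```python
from fractions import Fraction
from math import factorial


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    result = Fraction(1)
    for i in range(n):
        result *= theta + i
    return result


def ewens_probability(counts, theta):
    """Probability of an allele configuration under the standard formula.

    counts[j - 1] is a_j, the number of allele types seen exactly j times.
    theta is the scaled mutation rate (an int or Fraction, for exact results).
    """
    theta = Fraction(theta)
    # The sample size is implied by the configuration: n = sum of j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))
    prob = Fraction(factorial(n)) / rising_factorial(theta, n)
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (Fraction(j) ** a * factorial(a))
    return prob


def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest


def counts_from_partition(parts, n):
    """Convert a partition such as (2, 1) into the vector (a_1, ..., a_n)."""
    counts = [0] * n
    for p in parts:
        counts[p - 1] += 1
    return counts


def total_probability(n, theta):
    """Sum the formula over every configuration of a sample of size n."""
    return sum(
        ewens_probability(counts_from_partition(parts, n), theta)
        for parts in partitions(n)
    )
```

Exact rational arithmetic is used so that the sum over all configurations can be checked against 1 without rounding error; for example, `total_probability(6, Fraction(3, 2))` sums the formula over all eleven partitions of 6.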

MeSH terms

  • Alleles*
  • Models, Genetic*
  • Mutation
  • Population Density*
  • Time Factors