Probability of a segregating pattern in a sample of DNA sequences

Theor Popul Biol. 1998 Aug;54(1):1-10. doi: 10.1006/tpbi.1997.1359.


Mutations that result in segregating sites (polymorphic sites) in a sample of DNA sequences can be classified into different types. A pattern of segregating sites is an array of the numbers of mutations of each type. Using an urn model, the probability of a pattern of segregating sites can be expressed as a recurrence equation, and its value can be computed sequentially. The probabilities that can be computed by this method include the probability of obtaining k external mutations (mutations that occur in external branches of the genealogy of a sample), the probability of obtaining k internal mutations (mutations that occur in internal branches), the probability of obtaining k singletons (segregating sites at which one of the two segregating nucleotides is present in only one sequence), and the probability of obtaining k non-singletons. Two applications of the method are discussed. One is maximum likelihood estimation of θ, and the other is a Bayesian statistical test of the hypothesis of neutral mutations.
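
To make the quantities above concrete, the following is a minimal simulation sketch. It is not the urn-model recurrence of the paper, which the abstract does not spell out. It assumes the standard neutral coalescent with infinite-sites mutation, time measured so that k lineages coalesce at rate k(k-1)/2, and mutations falling on each branch at rate θ/2. The function names (poisson, simulate_counts, estimate_external_distribution) are hypothetical and chosen only for illustration. The sketch estimates the probability of obtaining k external mutations by Monte Carlo.

```python
import math
import random
from collections import Counter


def poisson(rng, lam):
    """Draw a Poisson(lam) variate (Knuth's method; adequate for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n, theta, rng):
    """Simulate one genealogy of n sequences (assumed neutral coalescent).

    Returns (external, internal) mutation counts.
    """
    # Each active lineage is represented by the number of sampled
    # sequences that descend from it; size 1 means an external branch.
    lineages = [1] * n
    ext_len = int_len = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time until the next coalescence among k lineages.
        t = rng.expovariate(k * (k - 1) / 2)
        n_ext = sum(1 for size in lineages if size == 1)
        ext_len += n_ext * t
        int_len += (k - n_ext) * t
        # Merge two lineages chosen uniformly at random.
        i, j = rng.sample(range(k), 2)
        merged = lineages[i] + lineages[j]
        lineages = [s for idx, s in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    # Mutations are Poisson with mean (theta / 2) * total branch length.
    return poisson(rng, theta / 2 * ext_len), poisson(rng, theta / 2 * int_len)


def estimate_external_distribution(n, theta, reps=20000, seed=0):
    """Monte Carlo estimate of P(k external mutations) for k = 0, 1, 2, ..."""
    rng = random.Random(seed)
    counts = Counter(simulate_counts(n, theta, rng)[0] for _ in range(reps))
    return {k: c / reps for k, c in sorted(counts.items())}


if __name__ == "__main__":
    dist = estimate_external_distribution(n=5, theta=2.0)
    for k, p in dist.items():
        print(k, round(p, 4))
```

Under these assumptions the mean number of external mutations should be close to θ, which gives a convenient sanity check on the simulation. A recurrence such as the one described in the abstract would give these probabilities exactly rather than by sampling.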

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bayes Theorem
  • DNA Mutational Analysis / statistics & numerical data*
  • Humans
  • Likelihood Functions
  • Models, Genetic
  • Polymorphism, Single-Stranded Conformational*