Estimating multilocus linkage disequilibria

Heredity (Edinb). 2000 Mar;84(Pt 3):373-89. doi: 10.1046/j.1365-2540.2000.00683.x.

Abstract

The state of a diploid population segregating for two alleles at each of n loci is described by 2^(2n) genotype frequencies, or equivalently, by allele frequencies and by multilocus moments or cumulants of various orders. These measures of linkage disequilibrium cannot usually be determined, both because one cannot tell whether a gene came from the maternal or paternal gamete, and because such a large number of parameters cannot be estimated even from large samples. Simplifying assumptions must therefore be made. This paper sets out methods for estimating multilocus genotype frequencies which are appropriate for unlinked neutral loci, and for populations that are ultimately derived by mixing of two source populations. In such a hybrid population, all multilocus associations depend primarily on the number of loci involved that derive from the maternal genome, and the number derived from the paternal genome. Allele frequencies may differ across loci, and the contribution of each locus to multilocus associations may be scaled by the difference in allele frequency between source populations for that locus (δp ≤ 1). For example, the cumulant describing the association between genes i, j, k from the maternal genome, and genes i, l from the paternal genome is κ_{i,j,k,i*,l*} = δp_i^2 δp_j δp_k δp_l κ_{3,2}. The state of the population is described by n allele frequencies; n divergences, δp; and by a symmetric matrix of cumulants, κ_{J,K} (J = 0, …, n; K = 0, …, n). Expressions for these cumulants under short- and long-range migration are given. Two methods for estimating the cumulants are described: a simple method based on multivariate moments, and a maximum likelihood procedure, which uses the Metropolis algorithm. Both methods perform well when tested against simulations with two or four loci.
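As an illustration of the scaling rule in the abstract, the following minimal Python sketch computes a locus-specific cumulant from the matrix of cumulants and the per-locus divergences. It assumes that the single example given (three maternal genes and two paternal genes scaled by the product of their divergences) generalises to any set of loci; the function names, the dictionary of divergences, and all numerical values are hypothetical and are not taken from the paper.

```python
import math

def n_genotype_classes(n):
    """Number of genotype frequencies for n biallelic loci in a diploid: 2^(2n)."""
    return 2 ** (2 * n)

def scaled_cumulant(maternal, paternal, delta_p, kappa):
    """Scale kappa[J][K] by the divergence of every gene involved.

    maternal, paternal: lists of locus labels drawn from each genome.
    delta_p: dict mapping locus label to its divergence (assumed <= 1).
    kappa: (n+1) x (n+1) symmetric matrix indexed by the number of
           maternal (J) and paternal (K) genes involved.
    Assumption: the product rule from the abstract's example holds generally.
    """
    J, K = len(maternal), len(paternal)
    scale = math.prod(delta_p[i] for i in maternal)
    scale *= math.prod(delta_p[i] for i in paternal)
    return scale * kappa[J][K]

# Hypothetical values for four loci i, j, k, l.
delta_p = {"i": 0.8, "j": 0.5, "k": 1.0, "l": 0.25}
kappa = [[0.0] * 5 for _ in range(5)]
kappa[3][2] = kappa[2][3] = 0.1  # symmetric placeholder entry

# Genes i, j, k from the maternal genome and i, l from the paternal genome:
# delta_p[i]^2 * delta_p[j] * delta_p[k] * delta_p[l] * kappa[3][2]
example = scaled_cumulant(["i", "j", "k"], ["i", "l"], delta_p, kappa)
```

With these placeholder numbers the result is approximately 0.008, and `n_genotype_classes(2)` gives 16, which shows how quickly the full set of genotype frequencies grows compared with the n allele frequencies, n divergences, and one symmetric matrix used in the reduced description.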

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alleles
  • Animals
  • Anura / genetics
  • Cell Nucleus / genetics
  • Cytoplasm / metabolism
  • Databases, Factual
  • Gene Frequency
  • Genotype
  • Likelihood Functions
  • Linkage Disequilibrium*
  • Models, Genetic*
  • Monte Carlo Method
  • Recombination, Genetic