Genetics. 2008 Sep;180(1):533-45.
doi: 10.1534/genetics.108.089409. Epub 2008 Aug 30.

Correlation-based inference for linkage disequilibrium with multiple alleles

Dmitri V Zaykin et al. Genetics. 2008 Sep.

Abstract

The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in a diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, R², which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of n(k - 1)(m - 1)/(km) R² under independence between loci is χ² with (k - 1)(m - 1) degrees of freedom. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, R² is the sum of the squared sample correlations for all km 2 × 2 subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for R² and report its close agreement with the exact distribution obtained by permutation. The test for independence using R² is a strong competitor to approaches such as Pearson's χ², Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of R². We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
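
The phase-known calculation described in the abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' program: the function names `r2_statistic` and `permutation_p_value` are hypothetical, n is taken to be the total count in the k × m table, and every allele is assumed to be observed at least once with k, m ≥ 2 so that no variance term is zero. The permutation scheme (shuffling the locus-B alleles against the locus-A alleles) is one common way to obtain a permutational P-value and is assumed here rather than taken from the paper.

```python
import random


def r2_statistic(table):
    """Return (R2, T, df) for a k x m table of haplotype counts.

    R2 sums the squared correlations of all k*m "one allele vs. the rest"
    2 x 2 subtables; T = n (k-1)(m-1)/(km) R2 is the scaled test statistic.
    Assumes every row and column total is positive and below n.
    """
    k, m = len(table), len(table[0])
    n = sum(sum(row) for row in table)  # assumption: n is the table total
    p_row = [sum(row) / n for row in table]
    p_col = [sum(table[i][j] for i in range(k)) / n for j in range(m)]
    r2_total = 0.0
    for i in range(k):
        for j in range(m):
            # Disequilibrium coefficient for allele i at A and allele j at B
            d_ij = table[i][j] / n - p_row[i] * p_col[j]
            var = p_row[i] * (1 - p_row[i]) * p_col[j] * (1 - p_col[j])
            r2_total += d_ij ** 2 / var
    df = (k - 1) * (m - 1)
    t = n * df / (k * m) * r2_total
    return r2_total, t, df


def permutation_p_value(table, n_perm=2000, seed=0):
    """Monte Carlo permutation P-value for T (hypothetical helper)."""
    rng = random.Random(seed)
    k, m = len(table), len(table[0])
    # Expand the table into paired allele labels, one pair per haplotype
    a = [i for i in range(k) for j in range(m) for _ in range(table[i][j])]
    b = [j for i in range(k) for j in range(m) for _ in range(table[i][j])]
    observed = r2_statistic(table)[1]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b)  # breaks the A-B association, keeps both margins
        perm = [[0] * m for _ in range(k)]
        for i, j in zip(a, b):
            perm[i][j] += 1
        if r2_statistic(perm)[1] >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    counts = [[20, 5, 5], [5, 20, 5], [5, 5, 20]]  # made-up example counts
    r2, t, df = r2_statistic(counts)
    print(r2, t, df, permutation_p_value(counts))
```

For k = m = 2 the four subtable correlations are equal in magnitude, so R² = 4r² and the scaled statistic reduces to the familiar n r². The approximate P-value would be the upper tail of a χ² distribution with `df` degrees of freedom evaluated at `t`; that step needs a χ² survival function, which the Python standard library does not provide, so it is left out of this sketch.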


Figures

Figure 1. (a) Plots of T2 P-values against the Tp P-values for the known haplotype phase simulations. (b) Plots of T2 P-values against the Tp P-values for the unknown haplotype phase simulations. (c) Plots of T2 P-values against Pearson's χ² P-values for the known haplotype phase simulations.
Figure 2. (a) Plots of T2 P-values against T1 P-values for the known haplotype phase simulations. (b) Plots of T2 P-values against T1 P-values for the unknown haplotype phase simulations.

References

    1. Boos, D. D., and J. Zhang, 2000. Monte Carlo evaluation of resampling-based hypothesis tests. J. Am. Stat. Assoc. 95: 486–492.
    2. Box, G. E. P., 1954. Some theorems on quadratic forms applied in the study of analysis of variance problems, II. Effect of inequality of variance in the two-way classification. Ann. Math. Stat. 25: 290–302.
    3. Cressie, N., and T. R. C. Read, 1984. Multinomial goodness-of-fit tests. J. R. Stat. Soc. B 46: 440–464.
    4. Evett, I. W., and B. S. Weir, 1998. Interpreting DNA Evidence. Sinauer Associates, Sunderland, MA.
    5. Excoffier, L., and M. Slatkin, 1995. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol. Biol. Evol. 12: 921–927.
