Correlation-based inference for linkage disequilibrium with multiple alleles
- PMID: 18757931
- PMCID: PMC2535703
- DOI: 10.1534/genetics.108.089409
Abstract
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles, and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in the diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, R², which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of n(k − 1)(m − 1)/(km) R² under independence between loci is χ² with (k − 1)(m − 1) degrees of freedom. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, R² is the sum of the squared sample correlations for all km 2 × 2 subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for R² and report its close agreement with the exact distribution obtained by permutation. The test for independence using R² is a strong competitor to approaches such as Pearson's chi-square test, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of R². We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
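The phase-known calculation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' program: the function name `total_correlation_test` and the layout of the count table are assumptions made here, and n is taken to be the total count in the table. It sums the squared correlations over all km collapsed 2 × 2 subtables and scales the sum by n(k − 1)(m − 1)/(km).

```python
def total_correlation_test(table):
    """Return (R2, statistic, df) for a k x m table of two-locus counts.

    Hypothetical helper for illustration only. table[i][j] is the count
    of observations carrying allele i at the first locus and allele j at
    the second. Every allele must be observed at least once and none may
    be fixed, otherwise a denominator below is zero.
    """
    k = len(table)
    m = len(table[0])
    n = sum(sum(row) for row in table)  # assumption: n = total table count

    # Marginal allele frequencies at each locus.
    row_freq = [sum(row) / n for row in table]
    col_freq = [sum(table[i][j] for i in range(k)) / n for j in range(m)]

    # Sum of squared correlations over all k*m "one allele vs. the rest"
    # 2 x 2 subtables.
    r2_total = 0.0
    for i in range(k):
        for j in range(m):
            p, q = row_freq[i], col_freq[j]
            d = table[i][j] / n - p * q  # disequilibrium for the allele pair
            r2_total += d * d / (p * (1 - p) * q * (1 - q))

    statistic = n * (k - 1) * (m - 1) / (k * m) * r2_total
    df = (k - 1) * (m - 1)
    return r2_total, statistic, df


r2, stat, df = total_correlation_test([[30, 10], [10, 50]])
print(r2, stat, df)
```

For two alleles at each locus the four squared correlations are identical, so the statistic reduces to n times the squared correlation, which matches the diallelic relation stated at the start of the abstract. An approximate P-value would come from the upper tail of a chi-square distribution with `df` degrees of freedom, for example via `scipy.stats.chi2.sf(stat, df)` if SciPy is available.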
References
- Boos, D. D., and J. Zhang, 2000. Monte Carlo evaluation of resampling-based hypothesis tests. J. Am. Stat. Assoc. 95: 486–492.
- Box, G. E. P., 1954. Some theorems on quadratic forms applied in the study of analysis of variance problems, II. Effect of inequality of variance in the two-way classification. Ann. Math. Stat. 25: 290–302.
- Cressie, N., and T. R. C. Read, 1984. Multinomial goodness-of-fit tests. J. R. Stat. Soc. B 46: 440–464.
- Evett, I. W., and B. S. Weir, 1998. Interpreting DNA Evidence. Sinauer Associates, Sunderland, MA.
- Excoffier, L., and M. Slatkin, 1995. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol. Biol. Evol. 12: 921–927.
