BMC Bioinformatics. 2007 Nov 2;8:428.
doi: 10.1186/1471-2105-8-428.

Cubic Exact Solutions for the Estimation of Pairwise Haplotype Frequencies: Implications for Linkage Disequilibrium Analyses and a Web Tool 'CubeX'

Free PMC article

Tom R Gaunt et al. BMC Bioinformatics.

Abstract

Background: The frequency of a haplotype comprising one allele at each of two loci can be expressed as a cubic equation (the 'Hill equation'), the solution of which gives that frequency. Most haplotype and linkage disequilibrium analysis programs use iteration-based algorithms which substitute an estimate of haplotype frequency into the equation, producing a new estimate which is repeatedly fed back into the equation until the values converge to a maximum likelihood estimate (expectation-maximisation).
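The iterative scheme described above can be sketched as follows. This is a minimal illustration, not the code of any particular program: the function name `em_ab_frequency`, the 3x3 genotype table layout and the linkage-equilibrium starting value are all assumptions made for the example. Only double heterozygotes have ambiguous phase, so each pass splits them between the two possible haplotype pairs using the current estimate and then recounts.

```python
def em_ab_frequency(counts, start=None, tol=1e-12, max_iter=10_000):
    """Estimate the A-B haplotype frequency by expectation-maximisation.

    counts[i][j] is the number of individuals carrying i copies of allele A
    at locus 1 and j copies of allele B at locus 2 (hypothetical layout).
    """
    n = sum(sum(row) for row in counts)
    chrom = 2 * n  # number of chromosomes sampled
    p = (2 * sum(counts[2]) + sum(counts[1])) / chrom  # frequency of A
    q = (2 * sum(r[2] for r in counts) + sum(r[1] for r in counts)) / chrom  # frequency of B
    # A-B haplotypes that can be counted directly from phase-known genotypes
    known_ab = 2 * counts[2][2] + counts[2][1] + counts[1][2]
    double_het = counts[1][1]  # phase-ambiguous individuals

    x = p * q if start is None else start  # assumed start: linkage equilibrium
    for _ in range(max_iter):
        cis = x * (1 - p - q + x)    # relative weight of AB/ab
        trans = (p - x) * (q - x)    # relative weight of Ab/aB
        total = cis + trans
        w = cis / total if total > 0 else 0.0
        x_new = (known_ab + double_het * w) / chrom
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Illustrative table, not data from the paper
example = [[10, 5, 1], [6, 12, 4], [2, 5, 5]]
print(em_ab_frequency(example))
```

Because the iteration stops at whichever stationary point it reaches first, a different `start` value may lead to a different answer when more than one exists.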

Results: We present a program, "CubeX", which calculates the biologically possible exact solution(s) and provides estimated haplotype frequencies, D', r² and χ² values for each. CubeX provides a "complete" analysis of haplotype frequencies and linkage disequilibrium for a pair of biallelic markers under situations where sampling variation and genotyping errors distort sample Hardy-Weinberg equilibrium, potentially causing more than one biologically possible solution. We also present an analysis of simulations and real data using the algebraically exact solution, which indicates that under perfect sample Hardy-Weinberg equilibrium there is only one biologically possible solution, but that under other conditions there may be more.
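A rough sketch of the idea, under stated assumptions: the cubic below is written as the fixed-point condition of the iterative update (the estimate that reproduces itself when substituted back in), and its roots inside the feasible range are located numerically by bisection rather than by the closed-form algebra CubeX uses. All function names and the 3x3 table layout are hypothetical, both loci are assumed polymorphic, and taking χ² as r² times the number of chromosomes is a common convention assumed here, not a detail taken from the abstract.

```python
import math


def table_summary(counts):
    """counts[i][j]: individuals with i copies of A and j copies of B."""
    chrom = 2 * sum(sum(row) for row in counts)
    p = (2 * sum(counts[2]) + sum(counts[1])) / chrom
    q = (2 * sum(r[2] for r in counts) + sum(r[1] for r in counts)) / chrom
    known_ab = 2 * counts[2][2] + counts[2][1] + counts[1][2]
    return chrom, p, q, known_ab, counts[1][1]


def cubic_coefficients(counts):
    """Coefficients (x^3, x^2, x^1, x^0) of the fixed-point cubic in x = f(AB)."""
    chrom, p, q, k, h = table_summary(counts)
    s, t, u = 1 - 2 * p - 2 * q, p * q, 1 - p - q
    return (2 * chrom, chrom * s - 2 * k - h, chrom * t - k * s - h * u, -k * t)


def feasible_roots(counts, iters=200):
    """Real roots with max(0, p+q-1) <= x <= min(p, q)."""
    _, p, q, _, _ = table_summary(counts)
    a, b, c, d = cubic_coefficients(counts)
    f = lambda x: ((a * x + b) * x + c) * x + d
    lo, hi = max(0.0, p + q - 1), min(p, q)
    eps = 1e-9 * (abs(a) + abs(b) + abs(c) + abs(d))
    # Split the range at the cubic's turning points so each piece is monotone
    cuts = [lo, hi]
    disc = b * b - 3 * a * c
    if disc > 0:
        r = math.sqrt(disc)
        cuts += [z for z in ((-b - r) / (3 * a), (-b + r) / (3 * a)) if lo < z < hi]
    cuts.sort()
    roots = [z for z in cuts if abs(f(z)) <= eps]  # roots at boundaries or turning points
    for left, right in zip(cuts, cuts[1:]):
        if abs(f(left)) > eps and abs(f(right)) > eps and f(left) * f(right) < 0:
            for _ in range(iters):
                mid = (left + right) / 2
                if f(left) * f(mid) <= 0:
                    right = mid
                else:
                    left = mid
            roots.append((left + right) / 2)
    return sorted(set(roots))


def ld_stats(x, p, q, chrom):
    """Signed D', r^2 and chi^2 for one candidate haplotype frequency x."""
    d = x - p * q
    d_max = min(p * (1 - q), (1 - p) * q) if d >= 0 else min(p * q, (1 - p) * (1 - q))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p * (1 - p) * q * (1 - q))
    return d_prime, r2, chrom * r2


# Illustrative table, not data from the paper
example = [[10, 5, 1], [6, 12, 4], [2, 5, 5]]
chrom, p, q, _, _ = table_summary(example)
for x in feasible_roots(example):
    print(x, ld_stats(x, p, q, chrom))
```

Reporting every feasible root, each with its own LD statistics, mirrors the behaviour described above; a single-start iterative method would return only one of them.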

Conclusion: Our analyses demonstrate that data with lower allele frequencies, lower sample numbers, population stratification or a possible |D'| value of 1 are particularly susceptible to distortion of sample Hardy-Weinberg equilibrium. This has significant implications for the calculation of linkage disequilibrium in small samples (e.g. HapMap) and for rarer alleles (e.g. paucimorphisms, q < 0.05), which may have particular disease relevance and require improved approaches for meaningful evaluation.
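One way to make "distortion of sample Hardy-Weinberg equilibrium" concrete is a goodness-of-fit statistic for a single biallelic locus. The sketch below is a standard Pearson χ² against Hardy-Weinberg expectations; it is offered as an illustration only, and the function name `hwe_chi2` is hypothetical rather than part of CubeX.

```python
def hwe_chi2(n_hom1, n_het, n_hom2):
    """Pearson chi-square of observed genotype counts against HWE expectations."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # sample allele frequency
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    observed = (n_hom1, n_het, n_hom2)
    # Classes with zero expectation (monomorphic locus) are skipped
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


print(hwe_chi2(25, 50, 25))  # 0.0, counts match HWE exactly
print(hwe_chi2(30, 40, 30))  # 4.0, heterozygote deficit
```

With few individuals or a rare allele, the expected counts are not whole numbers, so even a sample drawn from a population in equilibrium cannot match them exactly.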

Figures

Figure 1
Simulated data in which HWE is observed to the limit of rounding errors (whole number values for counts of individuals). Number of biologically possible solutions to the cubic equation in (A) 10 individuals; (B) 60 individuals; (C) 100 individuals; (D) 1000 individuals. x-axis: allele frequency of SNP1, y-axis: allele frequency of SNP2. Black = more than one solution. Grey = one solution.
Figure 2
Evaluation of number of solutions for real data. (A) Number of biologically possible solutions over a range of allele frequencies using a large sample of SNP data (Chr. 17:60 to 60.5 MB, 121 SNPs) from the HapMap project [23,24]. x-axis: allele frequency of SNP1, y-axis: allele frequency of SNP2. Black = more than one solution. Grey = one solution. (B) Comparison of two solutions within the dataset. x-axis: higher value solution, y-axis: lower value solution.
Figure 3
Screenshot of results screen from CubeX online analysis program. In this example there are two biologically possible solutions. Results for both are shown (upper table), and observed (input values) and expected diplotype frequencies (for the two solutions) displayed for comparison (lower table).
Figure 4
The range of LD in datasets using the CubeX tool to calculate r² and D'. (A) Simulated data. D' on x-axis, r² on y-axis. (B) Real SNP data (Chr. 17:60 to 60.5 MB, 121 SNPs) from the HapMap project [23,24]. D' on x-axis, r² on y-axis.

References

    1. Weiss KM, Clark AG. Linkage disequilibrium and the mapping of complex human traits. Trends in Genetics. 2002;18:19–24. doi: 10.1016/S0168-9525(01)02550-1.
    2. Palmer LJ, Cardon LR. Shaking the tree: mapping complex disease genes with linkage disequilibrium. The Lancet. 2005;366:1223–1234. doi: 10.1016/S0140-6736(05)67485-5.
    3. Ardlie KG, Kruglyak L, Seielstad M. Patterns of linkage disequilibrium in the human genome. Nat Rev Genet. 2002;3:299–309. doi: 10.1038/nrg777.
    4. Hill WG. Estimation of linkage disequilibrium in randomly mating populations. Heredity. 1974;33:229–239.
    5. Abecasis GR, Cookson WO. GOLD: graphical overview of linkage disequilibrium. Bioinformatics. 2000;16:182–183. doi: 10.1093/bioinformatics/16.2.182.