Genome Res. 2011 Jul;21(7):1122-30.
doi: 10.1101/gr.115832.110. Epub 2011 Mar 25.

Detection of Common Single Nucleotide Polymorphisms Synthesizing Quantitative Trait Association of Rarer Causal Variants

Free PMC article

Fumihiko Takeuchi et al. Genome Res.

Abstract

Genome-wide association (GWA) studies have identified hundreds of common (minor allele frequency ≥5%) single nucleotide polymorphisms (SNPs) associated with phenotypic traits or diseases, yet the causal variants accounting for the association signals have rarely been determined. The question then arises whether a GWA signal represents an "indirect association," in which the SNP is a proxy for a strongly correlated causal variant of similar frequency, or a "synthetic association" with one or more rarer causal variants in linkage disequilibrium (D′ ≈ 1, but r² not large). Answering this question generally requires extensive resequencing and association analysis. Instead, we propose to test statistically whether a quantitative trait (QT) association of an SNP represents a synthetic association by inspecting the QT distribution at each genotype, without requiring the causal variant(s) to be known. We devised two test statistics and assessed their power by mathematical analysis and simulation. Testing the heterogeneity of variance was powerful when low-frequency causal alleles were linked mostly to one SNP allele, whereas testing the skewness performed better when the causal alleles were linked evenly to either SNP allele. By testing a statistic combining these two in 5000 individuals, we could detect synthetic association of a GWA signal when the causal alleles sum to 3% in frequency. Such a signal explains only part of the heritability contributed by the whole locus. The proposed test is useful for designing fine mapping after the association of common SNPs has been studied exhaustively: we can prioritize which GWA signals and which individuals to resequence, and thereby identify the causal variants efficiently.
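The idea of inspecting the QT distribution at each marker genotype can be illustrated with a minimal Python sketch. It only computes descriptive statistics (sample variance and moment-based skewness) per genotype group; it is not the pair of test statistics devised in the paper, whose exact form is not given in the abstract. The function name `genotype_qt_summary` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, variance


def genotype_qt_summary(qt_values, genotypes):
    """Return {genotype: (n, sample variance, moment-based skewness)}.

    Descriptive summary only; an assumption-laden illustration, not the
    paper's test statistics. Each genotype group needs at least two values.
    """
    groups = defaultdict(list)
    for value, genotype in zip(qt_values, genotypes):
        groups[genotype].append(value)

    summary = {}
    for genotype, values in groups.items():
        n = len(values)
        centre = mean(values)
        # Second and third central moments (divided by n).
        m2 = sum((v - centre) ** 2 for v in values) / n
        m3 = sum((v - centre) ** 3 for v in values) / n
        skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
        summary[genotype] = (n, variance(values), skew)
    return summary


# Hypothetical toy data: the A/A group contains one high-QT individual,
# mimicking a carrier of a rarer causal allele linked to marker allele A.
qt = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0, 4.0]
gt = ["a/a"] * 3 + ["A/A"] * 4
summary = genotype_qt_summary(qt, gt)
```

In this toy example the A/A group shows both a larger variance and a positive skewness than the a/a group, which is the kind of pattern the proposed tests look for. A formal test would additionally need a null distribution for these quantities.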

Figures

Figure 1.
Probability distribution of the QT value within subgroups classified by marker SNP genotypes. (A) In the whole population, the total QT distribution (gray curve) is a mixture of normal distributions (black curves) with unit variance and means of 0, 1, or 2, corresponding to genotypes b1/b1, B1/b1, and B1/B1 at the causal variant. As genotype B1/B1 is rare (0.25%), the corresponding curve appears flat. (B) QT distribution among individuals with the A/A genotype at the marker. As the B1/B1 and B1/b1 genotypes are enriched in this subgroup due to LD, the variance is enlarged, as seen from the lower peak and wider spread of the gray curve. (C) Individuals with the A/a genotype have either the b1/b1 or the B1/b1 genotype, and the QT variance is moderately enlarged. (D) All individuals with the a/a genotype at the marker have the b1/b1 genotype at the causal variant. The QT variance is 1.10 in A, 1.38 in B, 1.19 in C, and 1 in D.
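The variance quoted for panel A can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions at the causal variant with a B1 allele frequency of 0.05, which is inferred from the stated B1/B1 frequency of 0.25% and is not given explicitly in the caption. The function name `mixture_moments` is hypothetical. Panels B and C are not reproduced because the genotype proportions within those subgroups are not stated.

```python
def mixture_moments(weights, means, within_var=1.0):
    """Mean and variance of a mixture of equal-variance normal components.

    Total variance = within-component variance + variance of component means.
    """
    total = sum(weights)
    mix_mean = sum(w * m for w, m in zip(weights, means)) / total
    between = sum(w * (m - mix_mean) ** 2 for w, m in zip(weights, means)) / total
    return mix_mean, within_var + between


# Assumption: Hardy-Weinberg proportions with B1 allele frequency p = 0.05,
# so that the B1/B1 genotype frequency is p**2 = 0.25% as in the caption.
p = 0.05
weights = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # b1/b1, B1/b1, B1/B1
means = [0.0, 1.0, 2.0]

mean_a, var_a = mixture_moments(weights, means)  # panel A: variance 1.095

# Panel D: every a/a individual is b1/b1, so there is a single component.
mean_d, var_d = mixture_moments([1.0], [0.0])    # variance 1.0
```

Under this assumption the whole-population variance is 1 + 2p(1 − p) = 1.095, consistent with the 1.10 reported for panel A, and the single-component case gives the variance of 1 reported for panel D.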
Figure 2.
(A) Haplotype classes and (B) their phylogeny for the marker SNP rs405509, which shows synthetic association of the functional variants rs7412 and rs429358 in the APOE locus. (C) LD coefficients between the SNPs associated with LDL-C. Haplotype frequencies were calculated using the PLINK software (Purcell et al. 2007).
Figure 3.
Power for detecting synthetic association by testing heteroscedasticity. The power was computed from simulation under four representative genetic models of synthetic association (see Methods), assuming a strength of marker association (R²mrk) of 0.00592. The horizontal and vertical axes represent the frequency of the marker allele A and the cumulative frequency of the causal alleles Bi (linked to allele A), respectively. The asterisk indicates the region where synthetic association is detectable with power >0.8. The black region of the parameter space should be disregarded, as it does not include causal variants accounting for the marker association.
Figure 4.
Power for detecting synthetic association by testing skewness. The power was computed from simulation under four representative genetic models, assuming a strength of marker association (R²mrk) of 0.00592. The format of the figure is the same as in Figure 3.
Figure 5.
Power for detecting synthetic association by the combined test of heteroscedasticity and skewness. The power was computed from simulation under four representative genetic models, assuming a strength of marker association (R²mrk) of 0.00592. The format of the figure is the same as in Figure 3.
