PLoS Genet. 2011 Jul;7(7):e1002177.
doi: 10.1371/journal.pgen.1002177. Epub 2011 Jul 28.

Gene-based Tests of Association

Free PMC article

Hailiang Huang et al. PLoS Genet.


Genome-wide association studies (GWAS) are now used routinely to identify SNPs associated with complex human phenotypes. In several cases, multiple variants within a gene contribute independently to disease risk. Here we introduce a novel Gene-Wide Significance (GWiS) test that uses greedy Bayesian model selection to identify the independent effects within a gene, which are combined to generate a stronger statistical signal. Permutation tests provide p-values that correct for the number of independent tests genome-wide and within each genetic locus. When applied to a dataset comprising 2.5 million SNPs in up to 8,000 individuals measured for various electrocardiography (ECG) parameters, this method identifies more validated associations than conventional GWAS approaches. The method also provides, for the first time, systematic assessments of the number of independent effects within a gene and of the fraction of disease-associated genes housing multiple independent effects, observed at 35%–50% of loci in our study. This method can be generalized to other study designs, retains power for low-frequency alleles, and provides gene-based p-values that are directly compatible with pathway-based meta-analysis.
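To make the idea in the abstract concrete, the following is a minimal sketch of greedy forward model selection within one gene, followed by a permutation p-value for the resulting gene-level statistic. It is not the authors' GWiS implementation: the Bayesian information criterion (BIC) is used here as an assumed stand-in for the paper's Bayesian model score, and the function names (`bic`, `greedy_select`, `permutation_pvalue`), the cap of 8 effects, and the simulated genotypes are all hypothetical choices made for illustration.

```python
import numpy as np

def bic(G, y, cols):
    """BIC of a linear model of y on an intercept plus the SNP columns in `cols`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [G[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return n * np.log(resid @ resid / n) + len(cols) * np.log(n)

def greedy_select(G, y, max_k=8):
    """Add one SNP at a time while the BIC keeps improving.

    Returns the selected column indices and the total BIC improvement
    over the intercept-only model (0.0 if no SNP is selected).
    """
    null_score = bic(G, y, [])
    best_score = null_score
    selected = []
    while len(selected) < max_k:
        candidates = [j for j in range(G.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = [bic(G, y, selected + [j]) for j in candidates]
        i = int(np.argmin(scores))
        if scores[i] >= best_score:  # no candidate improves the model
            break
        best_score = scores[i]
        selected.append(candidates[i])
    return selected, float(null_score - best_score)

def permutation_pvalue(G, y, n_perm=99, seed=0):
    """Gene-level p-value: rerun the whole greedy search on permuted phenotypes."""
    rng = np.random.default_rng(seed)
    _, observed = greedy_select(G, y)
    hits = sum(greedy_select(G, rng.permutation(y))[1] >= observed
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Toy data (assumed): 300 individuals, 10 SNPs, two independent causal SNPs.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(300, 10))
y = 0.8 * G[:, 2] + 0.8 * G[:, 7] + rng.normal(size=300)

selected, stat = greedy_select(G, y)
p_value = permutation_pvalue(G, y, n_perm=99)
```

Because each permutation repeats the full search over all SNPs in the gene, the null distribution already reflects the number of tests performed within the locus; a genome-wide correction would need to be applied on top of this.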

Conflict of interest statement

The authors have declared that no competing interests exist.


Figure 1. Estimated power at genome-wide significance for simulated data.
Power estimates for GWiS (black), minSNP-P (blue), BIMBAM (dashed blue), VEGAS (green), and LASSO (red) are shown for 0.007 population variance explained by a gene. Genes were selected at random from Chr 1; genotypes were taken from ARIC; and phenotypes were simulated according to known models with up to 8 causal variants with independent effects. (a) Power decreases as total variance is diluted over an increasing number of causal variants. (b) Power estimates with 95% confidence intervals are shown as a function of minor allele frequency (MAF) for the simulations from panel (a) with a single independent effect. GWiS, minSNP, minSNP-P, and BIMBAM are robust to low minor allele frequency, whereas VEGAS and LASSO lose power.
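The simulation design in this caption, a fixed share of phenotypic variance split across several causal variants, can be sketched as follows. This is an illustrative assumption about how such phenotypes might be generated, not the authors' simulation code: the function name `simulate_phenotype`, the equal split of variance among causal variants, and the randomly generated genotypes (the paper used ARIC genotypes) are all hypothetical.

```python
import numpy as np

def simulate_phenotype(G, causal, var_explained, rng):
    """Simulate a unit-variance phenotype from the causal SNP columns of G.

    The causal SNPs are standardized and given equal effects, so that,
    if they are uncorrelated, together they explain `var_explained`
    of the phenotypic variance and each explains var_explained / k.
    """
    Z = G[:, causal].astype(float)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    k = len(causal)
    beta = np.sqrt(var_explained / k)
    noise = rng.normal(0.0, np.sqrt(1.0 - var_explained), size=G.shape[0])
    return Z @ np.full(k, beta) + noise

# Toy genotypes (assumed): 8,000 individuals, 8 independent SNPs.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(8000, 8))

# Same total variance explained (0.007), diluted over 1 or 8 causal variants.
y_one = simulate_phenotype(G, [0], 0.007, rng)
y_eight = simulate_phenotype(G, list(range(8)), 0.007, rng)
```

With eight causal variants each one carries only one eighth of the signal, which is why single-SNP tests lose power as the same total variance is spread more thinly.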
Figure 2. Model size estimation.
The ability to recover the known model size was evaluated for GWiS (a and b) and LASSO (c and d). The power to detect a single SNP was set to be 10% (a and c) and 80% (b and d). In separate tests, the causal SNPs were either retained in (black) or removed from (red) the genotype data.
Figure 3. Recovery of known positive associations at genome-wide significance.
Of 38 known positives, GWiS identified 6 at genome-wide significance with no false positives. Univariate methods (minSNP and minSNP-P) and VEGAS identified a subset of 4 entirely contained by GWiS, and LASSO identified a smaller subset of 2.
Figure 4. Multiple weak effects identified as genome-wide significant.
GWiS correctly identifies the SCN5A-SCN10A locus as genome-wide significant with four independent effects, even though the strongest single effect has a p-value 100× worse than the genome-wide significance threshold, indicated as a dashed line. No other method was able to identify this locus as genome-wide significant. The SNPs selected by GWiS are represented as large, colored diamonds, and SNPs in LD with these four are colored in lighter shades. The light blue trace indicates recombination hotspots.
Figure 5. Distribution of the number of independent effects in ECG loci.
Of 38 known positive loci, GWiS identified 20 loci, and 7 of these contain multiple independent effects.


