Integrating comprehensive functional annotations to boost power and accuracy in gene-based association analysis

PLoS Genet. 2020 Dec 15;16(12):e1009060. doi: 10.1371/journal.pgen.1009060. eCollection 2020 Dec.


Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early applications of gene-based tests often focused on rare coding variants; a more recent wave of gene-based methods, e.g., TWAS, uses eQTLs to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy in identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and in an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases both power and performance in identifying causal genes.
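
To make the idea of single-annotation and omnibus gene-based tests concrete, the following is a minimal sketch and not the method or software described in this paper. It assumes a generic burden-style statistic computed from GWAS summary statistics (variant z-scores and an LD correlation matrix), with one weight vector per annotation, and it combines the per-annotation p-values with the Cauchy combination rule as one possible omnibus test. All function names, weights, and input values are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def weighted_burden_z(z, ld, weights):
    """Burden-style gene statistic from summary statistics (illustrative).

    z       : per-variant GWAS z-scores for one gene
    ld      : LD correlation matrix among those variants
    weights : annotation-derived variant weights (hypothetical)

    Under the null, w'z / sqrt(w'Rw) is standard normal.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    w = np.asarray(weights, dtype=float)
    var = float(w @ ld @ w)
    if var <= 0:
        raise ValueError("weights give a non-positive variance")
    return float(w @ z) / math.sqrt(var)


def two_sided_p(z_stat):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


def cauchy_combine(pvalues):
    """Equal-weight Cauchy combination of p-values (one omnibus choice)."""
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-15, 1 - 1e-15)
    t = float(np.mean(np.tan((0.5 - p) * np.pi)))
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # Toy inputs: three variants in one gene, assumed independent here.
    z = [2.0, 1.5, -0.5]
    ld = np.eye(3)
    # Hypothetical annotation weights: one coding set, one eQTL set.
    annotations = {
        "coding": [1.0, 0.0, 0.0],
        "eqtl": [0.0, 1.0, 1.0],
    }
    pvals = {
        name: two_sided_p(weighted_burden_z(z, ld, w))
        for name, w in annotations.items()
    }
    print(pvals)
    print("omnibus:", cauchy_combine(list(pvals.values())))
```

In this sketch each annotation yields its own gene-level p-value, and the omnibus step aggregates them without requiring the annotation-specific tests to be independent, which is the usual motivation for the Cauchy combination; other combination rules could be substituted.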

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Genome-Wide Association Study / methods*
  • Genome-Wide Association Study / standards
  • Humans
  • Molecular Sequence Annotation / methods*
  • Molecular Sequence Annotation / standards
  • Polymorphism, Genetic
  • Quantitative Trait Loci
  • Reproducibility of Results