Bioinformatics. 2018 Aug 1;34(15):2538-2545.
doi: 10.1093/bioinformatics/bty147.

A Bayesian Framework for Multiple Trait Colocalization From Summary Association Statistics

Free PMC article

Claudia Giambartolomei et al. Bioinformatics.

Abstract

Motivation: Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g. expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene in GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression, which in turn affects disease risk. In this work, we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci.

Results: We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease-associated genes.

Availability and implementation: moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe and scenario of interest.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Graphical representation of four possible configurations at a locus with eight SNPs in common across three traits. The traits are labeled G, E and M, representing the GWAS (G), eQTL (E) and mQTL (M) datasets, respectively. Each plot represents one possible configuration, that is, one combination of three binary vectors indicating whether each variant is associated with the selected trait. Top left (GEM scenario): one causal variant behind all of the associations. Top right (GE scenario): the same causal variant behind the G and E associations, and no association or lack of power for the M association. Bottom left (GE.M scenario): two causal variants, one shared by G and E and a different one for M. Bottom right (G.E.M scenario): three distinct causal variants, one behind each of the datasets considered.
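
The binary-vector view of a configuration in Fig. 1 can be sketched in a few lines of Python. This is an illustration only, not code from the moloc R package: the function names `causal_index` and `scenario_label` are hypothetical, the SNP positions are made up, and the sketch assumes at most one causal variant per trait at the locus.

```python
from typing import Dict, List, Optional


def causal_index(vec: List[int]) -> Optional[int]:
    """Return the position of the single causal SNP, or None if there is none.

    Assumption (for this sketch): at most one causal variant per trait.
    """
    hits = [i for i, v in enumerate(vec) if v]
    if len(hits) > 1:
        raise ValueError("this sketch allows at most one causal variant per trait")
    return hits[0] if hits else None


def scenario_label(config: Dict[str, List[int]]) -> str:
    """Turn per-trait binary vectors into a label such as 'GE.M'.

    Traits that share a causal SNP are written together; groups with
    different causal SNPs are separated by a dot.
    """
    groups: Dict[int, List[str]] = {}
    for trait, vec in config.items():
        idx = causal_index(vec)
        if idx is not None:
            groups.setdefault(idx, []).append(trait)
    if not groups:
        return "null"
    return ".".join("".join(traits) for traits in groups.values())


# Hypothetical locus with eight SNPs: G and E share SNP 2, M has its own SNP 5.
ge_m = {
    "G": [0, 0, 1, 0, 0, 0, 0, 0],
    "E": [0, 0, 1, 0, 0, 0, 0, 0],
    "M": [0, 0, 0, 0, 0, 1, 0, 0],
}
print(scenario_label(ge_m))
```

Each of the four panels in Fig. 1 corresponds to one such dictionary; setting the M vector to all zeros, for example, gives the GE scenario.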
Fig. 2.
Results from simulations under colocalization/non-colocalization scenarios (A, B), and results from the real data application (C). (A) Simulations under different sample sizes for all scenarios in moloc of three traits (GWAS, eQTL and mQTL). The y-axis shows the median and the 10% and 90% quantiles of the distribution of posterior probabilities ('PPA') supporting each of our scenarios of interest. Combined scenarios include gene-methylation pairs or genes that reach a posterior probability of GEM >= 80%, or + GE.M >= 80%, or GE >= 80%. All cases include 10 000 individuals in the GWAS dataset. The variance explained by the trait was set to 0.01 (1%) for GWAS, and to 0.1 (10%) for the eQTL and mQTL. (B) Posterior probabilities from simulations using a sample size of 10 000 individuals for the GWAS trait (denoted G), 300 for the eQTL trait (denoted E) and 300 for the mQTL trait (denoted M). The x-axis shows all 15 simulated scenarios, e.g. G.E.M, three different causal variants for each of the three traits. The y-axis shows the distribution of posterior probabilities under the simulated scenario. The height of each bar represents the mean of the PPA for each configuration across simulations. (C) Venn diagram comparing the number of colocalizations of two traits (coloc PPA >= 80%) with that of three traits (moloc PPA GE + GE.M + GEM).
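
One way to arrive at 15 scenarios for three traits is to count every way of grouping any subset of the traits by shared causal variant, including the case with no association at all. The Python sketch below enumerates them under that assumption. It is not taken from the moloc package, the helper names are hypothetical, and the labels it prints may order traits differently from the paper (for example `E.GM` rather than `GM.E`).

```python
from itertools import combinations
from typing import Iterator, List


def set_partitions(items: List[str]) -> Iterator[List[List[str]]]:
    """Yield every way to split `items` into non-empty, unordered blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def enumerate_scenarios(traits: List[str]) -> List[str]:
    """List scenario labels: associated traits grouped by shared causal variant."""
    labels = []
    for k in range(len(traits) + 1):
        for subset in combinations(traits, k):
            for partition in set_partitions(list(subset)):
                label = ".".join("".join(block) for block in partition)
                labels.append(label or "null")  # empty subset: no association
    return labels


scenarios = enumerate_scenarios(["G", "E", "M"])
print(len(scenarios), scenarios)
```

Under this counting there is 1 null scenario, 3 single-trait scenarios, 6 two-trait scenarios and 5 three-trait scenarios, which include the GE, GEM, GE.M and G.E.M cases named in the captions.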
