Hierarchical joint analysis of marginal summary statistics-Part I: Multipopulation fine mapping and credible set construction

Genet Epidemiol. 2024 Sep;48(6):241-257. doi: 10.1002/gepi.22562. Epub 2024 Apr 12.

Abstract

Recent advances in genome-wide association studies (GWAS) come not only from increasingly large sample sizes but also from a shift in focus towards underrepresented populations. Multipopulation GWAS increase power to detect novel risk variants and improve fine-mapping resolution by leveraging evidence and differences in linkage disequilibrium (LD) from diverse populations. Here, we extend our previous approach for single-population fine-mapping, Joint Analysis of Marginal SNP Effects (JAM), to a multipopulation analysis (mJAM). Under the assumption that true causal variants are common across studies, we implement a hierarchical model framework that conditions on multiple SNPs while explicitly incorporating the different LD structures across populations. The mJAM framework can be used to first select index variants using the mJAM likelihood with different feature selection approaches. In addition, we present a novel approach that leverages ideas from mediation analysis to construct credible sets for these index variants. Such credible sets can be constructed for any existing set of index variants. We illustrate the mJAM likelihood through two implementations: mJAM-SuSiE (a Bayesian approach) and mJAM-Forward selection. Through simulation studies based on realistic effect sizes and levels of LD, we demonstrate that mJAM performs well in constructing concise credible sets that include the underlying causal variants. In real data examples taken from the most recent multipopulation prostate cancer GWAS, we show several practical advantages of mJAM over existing multipopulation methods.
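
As a rough illustration of what it can mean to condition on multiple SNPs while keeping each population's LD structure separate, the sketch below pools per-population marginal effects into a single joint estimate under the assumption of shared causal effects. It is not the mJAM likelihood or the authors' implementation: the function name `pooled_joint_effects`, the small ridge term, the standardized-genotype approximation (X'X ≈ nR, X'y ≈ nb), and all toy numbers are assumptions made here for illustration only.

```python
import numpy as np


def pooled_joint_effects(marginal_effects, ld_matrices, sample_sizes, ridge=1e-8):
    """Pooled least-squares estimate of joint SNP effects (hypothetical helper).

    Assumes standardized genotypes, so that for each population
    X'X is approximately n * R and X'y is approximately n * b, where R is the
    LD (correlation) matrix and b the vector of marginal effects.
    """
    p = len(marginal_effects[0])
    xtx = np.zeros((p, p))
    xty = np.zeros(p)
    for b, R, n in zip(marginal_effects, ld_matrices, sample_sizes):
        xtx += n * np.asarray(R, dtype=float)  # population-specific LD enters here
        xty += n * np.asarray(b, dtype=float)
    xtx += ridge * np.eye(p)  # small ridge for numerical stability (assumption)
    return np.linalg.solve(xtx, xty)


# Toy example: SNP 1 is causal, SNP 2 is only a correlated tag.
true_joint = np.array([0.5, 0.0])
ld_pop_a = np.array([[1.0, 0.8], [0.8, 1.0]])  # strong LD in population A
ld_pop_b = np.array([[1.0, 0.2], [0.2, 1.0]])  # weak LD in population B

# Marginal effects implied by the joint effects under each LD structure.
marginal_a = ld_pop_a @ true_joint
marginal_b = ld_pop_b @ true_joint

joint = pooled_joint_effects(
    [marginal_a, marginal_b], [ld_pop_a, ld_pop_b], [5000, 3000]
)
print(joint)  # close to true_joint; the tag SNP's effect shrinks towards zero
```

In this toy setting the tag SNP has a non-zero marginal effect in both populations, but its joint effect is approximately zero once the causal SNP and the two LD matrices are accounted for, which is the intuition behind using LD differences to sharpen fine-mapping.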

Keywords: GWAS; diverse populations; fine-mapping; summary statistics.

MeSH terms

  • Bayes Theorem*
  • Chromosome Mapping / methods
  • Chromosome Mapping / statistics & numerical data
  • Computer Simulation
  • Genome-Wide Association Study* / methods
  • Humans
  • Likelihood Functions
  • Linkage Disequilibrium*
  • Male
  • Models, Genetic
  • Models, Statistical
  • Polymorphism, Single Nucleotide*
  • Prostatic Neoplasms / genetics