Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits

Nat Commun. 2021 Nov 30;12(1):6972. doi: 10.1038/s41467-021-27258-9.


We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP effects and 78 SNP-heritability parameters in the UK Biobank. We find that ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10 kb upstream of genes, while 12-25% is attributable to coding regions, 32-44% to introns, and 22-28% to distal regions 10-500 kb upstream. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.
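The criterion "≥95% probability of contributing ≥0.001% to the genetic variance" can be illustrated with a minimal sketch. It assumes that posterior draws of each region's genetic variance and of the total genetic variance are already available as arrays. The function name, array layout, and toy numbers below are hypothetical and are not taken from the GMRM software or its output format.

```python
import numpy as np

def prob_region_exceeds(region_var_draws, total_var_draws, threshold=1e-5):
    """Per region, the fraction of posterior draws in which the region's
    share of the total genetic variance is at least `threshold`
    (1e-5 corresponds to 0.001%)."""
    region = np.asarray(region_var_draws, dtype=float)  # shape (n_draws, n_regions)
    total = np.asarray(total_var_draws, dtype=float)    # shape (n_draws,)
    share = region / total[:, None]                     # variance share per draw and region
    return (share >= threshold).mean(axis=0)

# Toy, made-up draws: 4 posterior samples for 2 hypothetical regions.
draws = np.array([
    [2.0e-5, 0.0],
    [3.0e-5, 1.0e-6],
    [1.5e-5, 0.0],
    [4.0e-5, 2.0e-5],
])
total = np.ones(4)  # total genetic variance per draw (assumed to be 1 here)

probs = prob_region_exceeds(draws, total)
selected = probs >= 0.95  # regions meeting the 95% posterior probability cutoff
```

In this toy input the first region exceeds the threshold in every draw and the second in one draw out of four, so only the first would be reported under a 95% cutoff.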

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Body Height
  • Body Mass Index
  • Cardiovascular Diseases
  • Diabetes Mellitus, Type 2
  • Genetic Techniques
  • Genetic Variation
  • Genome-Wide Association Study*
  • Genomics*
  • Genotype
  • Humans
  • Introns
  • Models, Statistical
  • Multifactorial Inheritance / genetics*
  • Open Reading Frames
  • Phenotype
  • Software

Associated data

  • Dryad/10.5061/dryad.sqv9s4n51