Graphical analysis for phenome-wide causal discovery in genotyped population-scale biobanks

Nat Commun. 2021 Jan 13;12(1):350. doi: 10.1038/s41467-020-20516-2.


Causal inference via Mendelian randomization requires making strong assumptions about horizontal pleiotropy, where genetic instruments are connected to the outcome not only through the exposure. Here, we present causal Graphical Analysis Using Genetics (cGAUGE), a pipeline that overcomes these limitations using instrument filters with provable properties. This is achievable by identifying conditional independencies while examining multiple traits. cGAUGE also uses ExSep (Exposure-based Separation), a novel test for the existence of causal pathways that does not require selecting instruments. In simulated data we illustrate how cGAUGE can reduce the empirical false discovery rate by up to 30%, while retaining the majority of true discoveries. On 96 complex traits from 337,198 subjects from the UK Biobank, our results cover expected causal links and many new ones that were previously suggested by correlation-based observational studies. Notably, we identify multiple risk factors for cardiovascular disease, including red blood cell distribution width.
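The conditional-independence idea mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the cGAUGE or ExSep implementation; it only shows, under simplifying assumptions (one variant, one exposure, one outcome, linear effects, no unmeasured confounding), how a variant that acts on the outcome solely through the exposure becomes independent of the outcome once the exposure is conditioned on, whereas a horizontally pleiotropic variant does not. The function names (`simulate`, `partial_corr_pvalue`), effect sizes, allele frequency, and sample size are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def simulate(direct_effect, n=20000, seed=0):
    """Toy data: variant g -> exposure x -> outcome y.

    direct_effect is a hypothetical horizontal-pleiotropy effect of g on y
    that bypasses x. No unmeasured confounders are simulated (assumption).
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, n).astype(float)  # additive genotype coded 0/1/2
    x = 0.5 * g + rng.normal(size=n)           # exposure depends on the variant
    y = 0.4 * x + direct_effect * g + rng.normal(size=n)
    return g, x, y


def partial_corr_pvalue(a, b, cond=None):
    """Partial correlation of a and b given cond, with a Fisher z p-value."""
    n = len(a)
    if cond is None:
        design = np.ones((n, 1))                       # intercept only
    else:
        design = np.column_stack([np.ones(n), cond])   # intercept + covariates
    # Residualize both variables on the conditioning set via least squares.
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    r = float(np.corrcoef(res_a, res_b)[0, 1])
    k = design.shape[1] - 1                            # number of covariates
    z = math.atanh(r) * math.sqrt(n - k - 3)           # Fisher z statistic
    p = math.erfc(abs(z) / math.sqrt(2))               # two-sided normal p-value
    return r, p


for label, effect in [("no pleiotropy", 0.0), ("pleiotropic", 0.3)]:
    g, x, y = simulate(direct_effect=effect)
    _, p_marginal = partial_corr_pvalue(g, y)          # g vs y, unconditioned
    _, p_given_x = partial_corr_pvalue(g, y, x)        # g vs y, given x
    print(f"{label}: marginal p={p_marginal:.2e}, p given exposure={p_given_x:.2e}")
```

In this toy setting both variants are marginally associated with the outcome, but only the pleiotropic one remains associated after conditioning on the exposure. With unmeasured confounding between exposure and outcome, conditioning on the exposure alone can itself induce an association, which is one reason a real analysis needs to examine multiple traits rather than a single pair.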

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biological Specimen Banks*
  • Cardiovascular Diseases / genetics
  • Causality
  • Computer Simulation
  • Gene Regulatory Networks
  • Genetic Pleiotropy / genetics*
  • Genetic Variation
  • Genome-Wide Association Study / methods*
  • Genotype
  • Humans
  • Mendelian Randomization Analysis / methods
  • Models, Theoretical
  • Multifactorial Inheritance / genetics*
  • Phenotype
  • Risk Factors