Mendelian randomization analysis with multiple genetic variants using summarized data

Genet Epidemiol. 2013 Nov;37(7):658-65. doi: 10.1002/gepi.21758. Epub 2013 Sep 20.


Genome-wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual-level data in simulation studies. We investigate the impact of gene-gene interactions, linkage disequilibrium, and 'weak instruments' on these estimates. Both an inverse-variance weighted average of variant-specific associations and a likelihood-based approach for summarized data give estimates and precision similar to those of the two-stage least squares method for individual-level data, even when there are gene-gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P-value in a linear regression of the risk factor on each variant is less than 1×10⁻⁵, then weak instrument bias will be small. We use these methods to estimate the causal effect of low-density lipoprotein cholesterol (LDL-C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL-C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are about as efficient as those using individual-level data, although the necessary assumptions cannot be assessed as fully.
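
As a rough illustration of the inverse-variance weighted average mentioned in the abstract, the following Python sketch combines per-variant summary coefficients into a single causal estimate. It assumes uncorrelated variants (the abstract notes that precision is overstated under linkage disequilibrium) and ignores uncertainty in the variant-risk factor associations. The function name, argument names, and input numbers are hypothetical and are not taken from the paper.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: per-variant associations with the risk factor
    beta_y: per-variant associations with the outcome
    se_y:   standard errors of the outcome associations
    Assumes the variants are uncorrelated (no linkage disequilibrium).
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Weighted sum of products of the two sets of associations.
    numerator = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    # Sum of the weights; its reciprocal is the variance of the estimate.
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))

    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical summary statistics for three variants (illustration only).
# Each outcome association is exactly half the risk factor association,
# so the combined estimate is 0.5 by construction.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.02, 0.03, 0.025]

est, se = ivw_estimate(beta_x, beta_y, se_y)
ci_low, ci_high = est - 1.96 * se, est + 1.96 * se
print(f"IVW estimate: {est:.3f} (95% CI: {ci_low:.3f} to {ci_high:.3f})")
```

Each variant contributes a ratio estimate (outcome association divided by risk factor association), and the weights favor variants whose ratio is estimated most precisely. For a binary outcome the outcome associations would typically be log odds ratios, so the result is on the log odds scale.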

Keywords: Mendelian randomization; causal inference; genome-wide association study; instrumental variables; weak instruments.

MeSH terms

  • Bias
  • Cholesterol, LDL / biosynthesis
  • Cholesterol, LDL / genetics
  • Cholesterol, LDL / metabolism
  • Coronary Disease / genetics
  • Coronary Disease / metabolism
  • Coronary Disease / physiopathology
  • Genes / genetics
  • Genetic Variation / genetics*
  • Genome-Wide Association Study
  • Humans
  • Least-Squares Analysis
  • Likelihood Functions
  • Linear Models
  • Linkage Disequilibrium / genetics
  • Mendelian Randomization Analysis / methods*
  • Models, Genetic
  • Odds Ratio
  • Phenotype
  • Risk Factors

