Mendelian randomization is the use of genetic variants as instruments to assess the existence of a causal relationship between a risk factor and an outcome. A Mendelian randomization analysis requires a set of genetic variants that are strongly associated with the risk factor and associated with the outcome only through their effect on the risk factor. We describe a novel variable selection algorithm for Mendelian randomization that can identify sets of genetic variants that are suitable in both of these respects. Our algorithm is applicable in the context of two-sample summary-data Mendelian randomization and employs a recently proposed theoretical extension of the traditional Bayesian statistics framework, including a loss function to penalize genetic variants that exhibit pleiotropic effects. The algorithm offers robust inference through the use of model averaging, as we illustrate by running it on a range of simulation scenarios and comparing it against established pleiotropy-robust Mendelian randomization methods. In a real-data application, we study the effect of systolic and diastolic blood pressure on the risk of coronary heart disease (CHD). Based on a recent large-scale GWAS for blood pressure, we use 395 genetic variants for systolic and 391 variants for diastolic blood pressure. Both traits are shown to significantly increase CHD risk.
Keywords: Mendelian randomization; general Bayesian inference; instrumental variables; pleiotropy; variable selection.
© 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.