Machine learning is gaining prominence in the health sciences, where much of its use has focused on data-driven prediction. However, machine learning can also be embedded within causal analyses, potentially reducing biases arising from model misspecification. Using a question-and-answer format, we provide an introduction and orientation for epidemiologists who are interested in using machine learning but are concerned about potential bias or loss of rigor from the use of "black box" models. We conclude with sample software code that may lower the barrier to entry for these techniques.
Keywords: causal inference; double-robustness; epidemiologic methods; inverse probability weighting; machine learning; propensity score; targeted maximum likelihood estimation.
© The Author(s) 2021. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.