An Analytic Approach Using Candidate Gene Selection and Logic Forest to Identify Gene by Environment Interactions (G × E) for Systemic Lupus Erythematosus in African Americans

Genes (Basel). 2018 Oct 15;9(10):496. doi: 10.3390/genes9100496.


Development and progression of many human diseases, such as systemic lupus erythematosus (SLE), are hypothesized to result from interactions between genetic and environmental factors. Current approaches to identifying and evaluating interactions are limited, most often focusing on main effects and two-way interactions. While higher order interactions associated with disease are documented, they are difficult to detect, since expanding the search space to all possible interactions of $p$ predictors means evaluating $2^p - 1$ terms. For example, a data set with 150 candidate predictors requires considering over $10^{45}$ main effects and interactions. In this study, we present an analytical approach with three steps: selection of candidate single nucleotide polymorphisms (SNPs) and environmental and/or clinical factors; use of Logic Forest to identify predictors of disease, including higher order interactions; and confirmation, using logistic regression, of the association between the identified predictors and interactions and disease outcome. We applied this approach to a study investigating whether smoking and/or secondhand smoke exposure interacts with candidate SNPs, resulting in elevated risk of SLE. The approach identified both genetic and environmental risk factors, with evidence suggesting potential interactions between exposure to secondhand smoke as a child and genetic variation in the ITGAM gene associated with increased risk of SLE.
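
The size of the search space quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the study's method: the function names `count_terms` and `enumerate_terms` and the predictor labels are hypothetical, chosen only for illustration. It counts every non-empty subset of p predictors (main effects plus interactions of every order) and, for a small p, lists them explicitly to confirm the closed-form count.

```python
from itertools import combinations


def count_terms(p: int) -> int:
    """Number of non-empty subsets of p predictors.

    Subsets of size 1 are main effects; larger subsets are interactions.
    """
    if p < 0:
        raise ValueError("p must be non-negative")
    return 2**p - 1


def enumerate_terms(predictors):
    """List every main effect and interaction explicitly.

    Only feasible for small p, since the list has 2**p - 1 entries.
    """
    return [
        combo
        for k in range(1, len(predictors) + 1)
        for combo in combinations(predictors, k)
    ]


if __name__ == "__main__":
    # Hypothetical predictor labels, for illustration only.
    names = ["SNP_A", "SNP_B", "secondhand_smoke"]
    terms = enumerate_terms(names)
    print(len(terms), "terms for", len(names), "predictors")

    # With 150 candidate predictors the count exceeds 10**45.
    print("150 predictors exceed 10**45 terms:", count_terms(150) > 10**45)
```

The exhaustive listing is included only to make the count concrete; at p = 150 it is infeasible, which is why a search method such as Logic Forest is used rather than enumeration.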

Keywords: candidate genes; gene–environment interactions; logic forest; systemic lupus erythematosus.