The identification and characterization of genes that influence the risk of common, complex multifactorial diseases, primarily through interactions with other genes and with environmental factors, remain a statistical and computational challenge in genetic epidemiology. This challenge is partly due to the limitations of parametric statistical methods for detecting genetic effects that depend solely or partially on interactions with other genes and environmental exposures. We previously introduced multifactor dimensionality reduction (MDR) as a method for reducing the dimensionality of multilocus genotype information to improve the identification of polymorphism combinations associated with disease risk. The MDR approach is nonparametric (i.e., no hypothesis about the value of a statistical parameter is made), is model-free (i.e., assumes no particular inheritance model), and is directly applicable to case-control and discordant sib-pair study designs. Both empirical and theoretical studies suggest that MDR has excellent power for identifying high-order gene-gene interactions. However, the power of MDR for identifying gene-gene interactions in the presence of common sources of noise is not currently known. The goal of this study was to evaluate the power of MDR for identifying gene-gene interactions in the presence of noise due to genotyping error, missing data, phenocopy, and genetic (locus) heterogeneity. Using simulated data, we show that MDR has high power to identify gene-gene interactions in the presence of 5% genotyping error, 5% missing data, or a combination of both. However, MDR has reduced power for some models in the presence of 50% phenocopy, and very limited power in the presence of 50% genetic heterogeneity. Extending MDR to address genetic heterogeneity should be a priority for the continued methodological development of this approach.
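The dimensionality-reduction step described above can be illustrated with a minimal sketch. It assumes genotypes coded 0, 1, or 2 per locus, a case-control status coded 1 or 0, and a case:control ratio threshold of 1.0 for labeling a multilocus genotype cell as high risk. The function names (`mdr_labels`, `mdr_accuracy`, `best_model`) and the toy data are hypothetical, and the sketch omits the cross-validation typically used to select a final model as well as any handling of missing genotypes.

```python
from collections import Counter
from itertools import combinations, product


def mdr_labels(genotypes, status, loci, threshold=1.0):
    """Pool multilocus genotype cells into high risk (1) or low risk (0)."""
    cases, controls = Counter(), Counter()
    for row, s in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        if s == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1
    # A cell is high risk when cases / controls >= threshold; written as a
    # product so that cells with zero controls need no special handling.
    return {
        cell: 1 if cases[cell] >= threshold * controls[cell] else 0
        for cell in set(cases) | set(controls)
    }


def mdr_accuracy(genotypes, status, loci, labels):
    """Fraction of subjects whose status matches their cell's risk label."""
    correct = 0
    for row, s in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        # Cells not seen when the labels were built default to low risk.
        if labels.get(cell, 0) == s:
            correct += 1
    return correct / len(status)


def best_model(genotypes, status, order=2):
    """Exhaustively score every locus combination of the given order."""
    best = None
    for loci in combinations(range(len(genotypes[0])), order):
        labels = mdr_labels(genotypes, status, loci)
        acc = mdr_accuracy(genotypes, status, loci, labels)
        if best is None or acc > best[1]:
            best = (loci, acc)
    return best


# Toy data (an assumption for illustration): three loci, every genotype
# combination once; disease depends only on loci 0 and 1, with no marginal
# requirement that either locus alone determine status.
toy_genotypes = [list(g) for g in product(range(3), repeat=3)]
toy_status = [1 if (g[0] == 1) != (g[1] == 1) else 0 for g in toy_genotypes]
toy_best = best_model(toy_genotypes, toy_status, order=2)
```

In this sketch the one-dimensional high-risk/low-risk label replaces the full multilocus genotype, which is the sense in which the dimensionality is reduced. Noise such as genotyping error or phenocopy could be explored by perturbing `toy_genotypes` or `toy_status` before calling `best_model`.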
Copyright 2003 Wiley-Liss, Inc.