Interactions among genes (epistasis) are believed to play an important role in susceptibility to common diseases (Moore and Williams. Ann Med 34:88-95; Ritchie et al. Am J Hum Genet 69:138-147). To study the genetic variants underlying diseases, genome-wide association studies (GWAS) that simultaneously assay several hundred thousand SNPs are increasingly being used. The data from these studies are often analyzed with single-locus methods (Lambert et al. Nat Genet 41:1094-1099; Reiman et al. Neuron 54:713-720). However, epistatic interactions may not be easily detected with single-locus methods (Marchini et al. Nat Genet 37:413-417). As a result, both parametric and nonparametric multi-locus methods have been developed to detect such interactions (Heidema et al. BMC Genet 7:23). Nevertheless, efficiently analyzing epistasis in high-dimensional genome-wide data remains a crucial challenge. We develop a method based on Bayesian networks and the minimum description length principle for detecting epistatic interactions. We compare its ability to detect gene-gene interactions and its efficiency with those of the combinatorial method multifactor dimensionality reduction (MDR), using 28,000 simulated data sets generated from 70 different genetic models. We further apply the method to over 300,000 SNPs obtained from a GWAS involving late-onset Alzheimer's disease (LOAD). Our method outperforms MDR, and we substantiate previous results indicating that the GAB2 gene is associated with LOAD. To our knowledge, this is the first successful model-based epistatic analysis using a high-dimensional genome-wide data set.
(c) 2010 Wiley-Liss, Inc.