Genetic Programming Neural Networks: A Powerful Bioinformatics Tool for Human Genetics

Appl Soft Comput. 2007 Jan;7(1):471-479. doi: 10.1016/j.asoc.2006.01.013.


The identification of genes that influence the risk of common, complex diseases primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. This challenge is partly due to the limitations of parametric statistical methods for detecting genetic effects that depend solely or partially on interactions. We have previously introduced a genetic programming neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene-gene and gene-environment combinations associated with disease risk. Previous empirical studies suggest that GPNN has excellent power for identifying gene-gene and gene-environment interactions. The goal of this study was to compare the power of GPNN with that of stepwise logistic regression (SLR) and classification and regression trees (CART) for identifying such interactions. SLR and CART are standard methods of analysis for genetic association studies. Using simulated data, we show that GPNN has higher power to identify gene-gene and gene-environment interactions than SLR and CART. These results indicate that GPNN may be a useful pattern recognition approach for detecting gene-gene and gene-environment interactions in studies of human disease.
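To illustrate why effects that depend solely on interactions are hard to detect one variable at a time, the following minimal sketch computes the marginal disease risk at each locus under a purely interactive two-locus model. The penetrance table, the allele frequency of 0.5, and the function name `marginal_penetrance` are assumptions chosen for illustration; they are not taken from the simulations in this study.

```python
# Hypothetical two-locus penetrance table (illustrative only, not from the study).
# Rows: genotypes AA, Aa, aa at locus A. Columns: genotypes BB, Bb, bb at locus B.
# Each entry is the assumed probability of disease for that genotype combination.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

# Hardy-Weinberg genotype frequencies, assuming an allele frequency of 0.5.
GENOTYPE_FREQ = [0.25, 0.5, 0.25]


def marginal_penetrance(table, freqs, axis):
    """Return the disease risk for each genotype at one locus,
    averaged over the genotype frequencies of the other locus.

    axis=0 gives marginals for locus A (rows), axis=1 for locus B (columns).
    """
    marginals = []
    for i in range(3):
        risk = 0.0
        for j in range(3):
            cell = table[i][j] if axis == 0 else table[j][i]
            risk += freqs[j] * cell
        marginals.append(risk)
    return marginals


marginal_a = marginal_penetrance(PENETRANCE, GENOTYPE_FREQ, axis=0)
marginal_b = marginal_penetrance(PENETRANCE, GENOTYPE_FREQ, axis=1)

print("Marginal risk by genotype at locus A:", marginal_a)
print("Marginal risk by genotype at locus B:", marginal_b)
```

Under these assumptions every genotype at either locus carries the same marginal risk, even though the joint risk ranges from 0 to 0.1 across genotype combinations. A procedure that screens variables by their individual effects has nothing to select in such a model, which is the kind of setting in which methods that evaluate combinations of variables are expected to help.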