Background: Growing interest on biological pathways has called for new statistical methods for modeling and testing a genetic pathway effect on a health outcome. The fact that genes within a pathway tend to interact with each other and relate to the outcome in a complicated way makes nonparametric methods more desirable. The kernel machine method provides a convenient, powerful and unified method for multi-dimensional parametric and nonparametric modeling of the pathway effect.
Results: In this paper we propose a logistic kernel machine regression model for binary outcomes. This model relates the disease risk to covariates parametrically, and to genes within a genetic pathway parametrically or nonparametrically using kernel machines. The nonparametric genetic pathway effect allows for possible interactions among the genes within the same pathway and a complicated relationship of the genetic pathway and the outcome. We show that kernel machine estimation of the model components can be formulated using a logistic mixed model. Estimation hence can proceed within a mixed model framework using standard statistical software. A score test based on a Gaussian process approximation is developed to test for the genetic pathway effect. The methods are illustrated using a prostate cancer data set and evaluated using simulations. An extension to continuous and discrete outcomes using generalized kernel machine models and its connection with generalized linear mixed models is discussed.
Conclusion: Logistic kernel machine regression and its extension generalized kernel machine regression provide a novel and flexible statistical tool for modeling pathway effects on discrete and continuous outcomes. Their close connection to mixed models and attractive performance make them have promising wide applications in bioinformatics and other biomedical areas.