Learning interactions via hierarchical group-lasso regularization

J Comput Graph Stat. 2015;24(3):627-654. doi: 10.1080/10618600.2014.938812. Epub 2015 Sep 16.


We introduce a method for learning pairwise interactions in a linear regression or logistic regression model in a manner that satisfies strong hierarchy: whenever an interaction is estimated to be nonzero, both its associated main effects are also included in the model. We motivate our approach by modeling pairwise interactions for categorical variables with arbitrary numbers of levels, and then show how we can accommodate continuous variables as well. Our approach allows us to dispense with explicitly applying constraints on the main effects and interactions for identifiability, which results in interpretable interaction models. We compare our method with existing approaches on both simulated and real data, including a genome-wide association study, all using our R package glinternet.
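
To make the strong hierarchy condition concrete, the following minimal sketch checks whether a set of estimated coefficients satisfies it: every nonzero interaction must have both of its main effects nonzero. It is not part of the glinternet package. The function name `hierarchy_violations`, the dictionary layout, and the coefficient values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def hierarchy_violations(
    main_effects: Dict[str, float],
    interactions: Dict[Tuple[str, str], float],
    tol: float = 0.0,
) -> List[Tuple[str, str]]:
    """Return the interactions that break strong hierarchy.

    An interaction (a, b) breaks strong hierarchy when its coefficient is
    nonzero but the main effect of a or of b is zero (or absent).
    """
    # Variables whose main-effect coefficient is nonzero.
    active = {name for name, coef in main_effects.items() if abs(coef) > tol}

    violations = []
    for (a, b), coef in interactions.items():
        if abs(coef) > tol and not (a in active and b in active):
            violations.append((a, b))
    return violations


# Hypothetical coefficients: x3 has no main effect.
mains = {"x1": 0.5, "x2": -1.2, "x3": 0.0}

# Both parents of the x1:x2 interaction are in the model, so no violation.
ok = hierarchy_violations(mains, {("x1", "x2"): 0.3})

# The x1:x3 interaction is nonzero while x3 is excluded, so it is flagged.
bad = hierarchy_violations(mains, {("x1", "x2"): 0.3, ("x1", "x3"): 0.4})
```

A model fitted under strong hierarchy should always yield an empty list from a check of this kind.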

Keywords: computer intensive; hierarchical; interaction; logistic; regression.