Detection of gene-gene interactions using multistage sparse and low-rank regression

Biometrics. 2016 Mar;72(1):85-94. doi: 10.1111/biom.12374. Epub 2015 Aug 19.


Finding an efficient and computationally feasible approach to deal with the curse of high dimensionality is a daunting challenge faced by modern biological science. The problem becomes even more severe when interactions are the research focus. To improve the performance of statistical analyses, we propose a sparse and low-rank (SLR) screening procedure that combines a low-rank interaction model with Lasso screening. SLR models the interaction effects using a low-rank matrix to achieve a parsimonious parametrization. The low-rank model increases the efficiency of statistical inference and, hence, SLR screening is able to detect gene-gene interactions more accurately than conventional methods. We also discuss incorporating SLR screening into the Screen-and-Clean approach (Wasserman and Roeder, 2009; Wu et al., 2010); the resulting procedure suffers less penalty from Bonferroni correction and is able to assign p-values to the identified variables in high-dimensional models. We apply the proposed screening procedure to the Warfarin dosage study and the CoLaus study. The results suggest that the new procedure can identify main and interaction effects that would have been omitted by conventional screening methods.
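
To illustrate why a low-rank matrix gives a parsimonious parametrization of interaction effects, the following is a minimal sketch and not the authors' implementation. It assumes the pairwise interaction coefficients are stored in a symmetric p-by-p matrix eta, and that a low-rank version can be written as U diag(signs) U^T for a p-by-r factor U. The function names, the factor U, and the sign vector are all hypothetical choices made for this illustration; the Lasso screening and Screen-and-Clean steps are not shown.

```python
import numpy as np


def interaction_param_counts(p, r):
    """Compare parameter counts for pairwise interactions among p variables.

    full:     one free coefficient per unordered pair (j, k), j < k.
    low_rank: entries of a p-by-r factor, an upper bound on the free
              parameters of a rank-r symmetric interaction matrix.
    """
    full = p * (p - 1) // 2
    low_rank = r * p
    return full, low_rank


def low_rank_interaction_matrix(U, signs=None):
    """Build a symmetric matrix eta = U diag(signs) U^T of rank at most r.

    U is a hypothetical p-by-r factor; signs (length r, default all +1)
    allows eta to be indefinite. This is an assumed parametrization.
    """
    U = np.asarray(U, dtype=float)
    if signs is None:
        signs = np.ones(U.shape[1])
    return (U * np.asarray(signs, dtype=float)) @ U.T


def interaction_term(x, eta):
    """Return the sum over pairs j < k of eta[j, k] * x[j] * x[k]."""
    x = np.asarray(x, dtype=float)
    upper = np.triu(eta, k=1)  # keep only the strictly upper triangle
    return float(x @ upper @ x)
```

For example, with p = 100 variables and rank r = 2, `interaction_param_counts(100, 2)` returns 4950 pairwise coefficients for the unrestricted model against at most 200 factor entries for the low-rank one, which is the kind of reduction the abstract refers to as parsimonious parametrization.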

Keywords: Asymptotic normality; Gene-gene interactions; Low-rank approximation; Over-parametrization; Screen-and-Clean; Sparsity.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • High-Throughput Screening Assays / methods*
  • Models, Statistical*
  • Pattern Recognition, Automated / methods
  • Protein Interaction Mapping / methods*
  • Regression Analysis*
  • Reproducibility of Results
  • Sensitivity and Specificity