A novel SCCA approach via truncated ℓ1-norm and truncated group lasso for brain imaging genetics

Bioinformatics. 2018 Jan 15;34(2):278-285. doi: 10.1093/bioinformatics/btx594.

Abstract

Motivation: Brain imaging genetics, which studies the linkage between genetic variations and structural or functional measures of the human brain, has become increasingly important in recent years. Discovering the bi-multivariate relationship between genetic markers such as single-nucleotide polymorphisms (SNPs) and neuroimaging quantitative traits (QTs) is a major task in imaging genetics. Sparse Canonical Correlation Analysis (SCCA) has been a popular technique in this area because it identifies bi-multivariate relationships and performs feature selection at the same time. The existing SCCA methods impose either the ℓ1-norm or its variants to induce sparsity. The ℓ0-norm penalty is the ideal sparsity-inducing tool, but optimizing it is an NP-hard problem.

Results: In this paper, we propose the truncated ℓ1-norm penalized SCCA to improve the performance and effectiveness of the ℓ1-norm based SCCA methods. We also propose an efficient optimization algorithm to solve this novel SCCA problem. The proposed method performs adaptive shrinkage that is controlled by tuning τ. Given a reasonably small τ, it can avoid time-intensive parameter tuning. Furthermore, we extend it to the truncated group-lasso (TGL), and propose the TGL-SCCA model to improve the group-lasso-based SCCA methods. The experimental results, compared with four benchmark methods, show that our SCCA methods identify better or similar correlation coefficients, and better canonical loading profiles than the competing methods. This demonstrates the effectiveness and efficiency of our methods in discovering interesting imaging genetic associations.
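
The abstract does not give the penalty's formula. As a minimal sketch, assuming the commonly used form of the truncated ℓ1 penalty (the sum over coefficients of min(|w_i|/τ, 1)) and an analogous group version that truncates each group's ℓ2 norm, the role of τ can be illustrated in Python. The function names and example weights are hypothetical and are not taken from the authors' Matlab code.

```python
import math


def truncated_l1(weights, tau):
    """Assumed truncated l1 penalty: sum_i min(|w_i| / tau, 1)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    return sum(min(abs(w) / tau, 1.0) for w in weights)


def truncated_group_lasso(weights, groups, tau):
    """Assumed group version: sum_g min(||w_g||_2 / tau, 1).

    `groups` is a list of index lists, one per feature group.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    total = 0.0
    for idx in groups:
        group_norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        total += min(group_norm / tau, 1.0)
    return total


def l0_norm(weights):
    """Number of nonzero coefficients."""
    return sum(1 for w in weights if w != 0)


# Hypothetical loading vector with three nonzero entries.
w = [0.0, 0.5, -2.0, 0.0, 3.0]

# With a small tau every nonzero entry is capped at 1,
# so the penalty equals the l0 count.
small_tau_penalty = truncated_l1(w, tau=0.1)   # 3.0
nonzeros = l0_norm(w)                          # 3

# With a large tau no entry is truncated,
# so the penalty is the l1 norm scaled by 1 / tau.
large_tau_penalty = truncated_l1(w, tau=10.0)  # about 0.55

# Group version: counts the groups whose norm exceeds tau.
group_penalty = truncated_group_lasso(w, [[0, 3], [1, 2, 4]], tau=0.1)  # 1.0
```

Under this assumed form, a small τ makes the penalty behave like the ℓ0 count while remaining a function of |w|, which is one way to read the abstract's remark that a reasonably small τ avoids extensive tuning.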

Availability and implementation: The Matlab code and sample data are freely available at http://www.iu.edu/~shenlab/tools/tlpscca/.

Supplementary information: Supplementary data are available at Bioinformatics online.