Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty

Amicie de Pierrefeu; Tommy Lofstedt; Fouad Hadj-Selem; Mathieu Dubois; Renaud Jardri; Thomas Fovet; Philippe Ciuciu; Vincent Frouin; Edouard Duchesnay

doi:10.1109/TMI.2017.2749140

IEEE Trans Med Imaging. 2018 Feb;37(2):396-407. doi: 10.1109/TMI.2017.2749140. Epub 2017 Sep 4.

PMID: 28880163

Abstract

Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., ℓ1 and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as N-dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approaches (such as GraphNet PCA) are significant, since SPCA-TV reveals the variability within a data set in the form of intelligible brain patterns that are easier to interpret and more stable across different samples.
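
To make the idea of a combined sparsity and TV penalty concrete, the following is a minimal sketch, not the authors' implementation. It assumes a loading vector reshaped as a 2-D image, an isotropic TV term built from forward differences, and hypothetical weights `l1`, `l2` and `tv`; the function name `tv_elastic_net_penalty` is also an assumption made for illustration. It only evaluates the penalty and does not perform the SPCA-TV optimization described in the paper.

```python
import numpy as np


def tv_elastic_net_penalty(v, l1=1.0, l2=1.0, tv=1.0):
    """Evaluate an illustrative TV-elastic net penalty on a 2-D loading image.

    The weights l1, l2 and tv are hypothetical names for the three
    regularization parameters; the exact scaling used in the paper may differ.
    """
    v = np.asarray(v, dtype=float)

    # Forward differences along each axis; the last row/column is left at
    # zero, which amounts to assuming no gradient across the far border.
    dx = np.zeros_like(v)
    dy = np.zeros_like(v)
    dx[:-1, :] = v[1:, :] - v[:-1, :]
    dy[:, :-1] = v[:, 1:] - v[:, :-1]

    tv_term = np.sqrt(dx ** 2 + dy ** 2).sum()  # isotropic total variation
    l1_term = np.abs(v).sum()                   # promotes sparsity
    l2_term = (v ** 2).sum()                    # ridge part of the elastic net
    return l1 * l1_term + l2 * l2_term + tv * tv_term


# Two loadings with the same number of non-zero pixels:
# one compact region and one set of isolated pixels.
blob = np.zeros((6, 6))
blob[2:4, 2:4] = 1.0

scattered = np.zeros((6, 6))
for i, j in [(1, 1), (1, 4), (4, 1), (4, 4)]:
    scattered[i, j] = 1.0

print(tv_elastic_net_penalty(blob), tv_elastic_net_penalty(scattered))
```

Both loadings have identical ℓ1 and ℓ2 terms, so only the TV term distinguishes them: the compact region has a shorter boundary and therefore a lower penalty than the isolated pixels. This illustrates why adding TV may favor contiguous regions over scattered ones.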

MeSH terms

Algorithms
Brain / diagnostic imaging
Humans
Image Processing, Computer-Assisted / methods*
Magnetic Resonance Imaging / methods*
Neuroimaging
Principal Component Analysis / methods*
Unsupervised Machine Learning