Scalable Orthonormal Projective NMF via Diversified Stochastic Optimization

Abdalla Bani; Sung Min Ha; Pan Xiao; Thomas Earnest; John Lee; Aristeidis Sotiras

doi:10.1007/978-3-031-34048-2_38

Inf Process Med Imaging. 2023:13939:497-508. doi: 10.1007/978-3-031-34048-2_38. Epub 2023 Jun 8.

Authors

Abdalla Bani¹, Sung Min Ha¹, Pan Xiao¹, Thomas Earnest¹, John Lee¹, Aristeidis Sotiras¹,²

Affiliations

¹ Department of Radiology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA.
² Institute for Informatics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA.

PMID: 37969113
PMCID: PMC10642358 (available on 2024-06-08)
DOI: 10.1007/978-3-031-34048-2_38

Abstract

The increasing availability of large-scale neuroimaging initiatives opens exciting opportunities for discovery science of human brain structure and function. Data-driven techniques, such as Orthonormal Projective Non-negative Matrix Factorization (opNMF), are well positioned to explore multivariate relationships in big data and thereby uncover brain organization. opNMF offers better interpretability and reproducibility than commonly used matrix factorization methods such as Principal Component Analysis (PCA) and Independent Component Analysis (ICA), which has led to its wide adoption in clinical computational neuroscience. However, applying opNMF in large-scale cohort studies is hindered by its limited scalability, a consequence of its computational complexity. In this work, we address the computational challenges of opNMF using a stochastic optimization approach that learns over mini-batches of the data. Additionally, we diversify the stochastic batches via repulsive point processes, which reduce redundancy in the mini-batches and in turn lead to lower variance in the updates. We validated our framework on gray matter tissue density maps estimated from 1000 subjects in the Open Access Series of Imaging Studies (OASIS) dataset. We demonstrated that operating over mini-batches of data yields a significant reduction in computational cost. Importantly, we showed that our novel optimization does not compromise the accuracy or interpretability of the factors when compared to standard opNMF. The proposed model enables new investigations of brain structure using big neuroimaging data that could improve our understanding of the brain in health and disease.
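
The mini-batch idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the update rule, the normalization, or the repulsive point process used. The sketch assumes the commonly cited opNMF multiplicative update, W <- W * (X X^T W) / (W W^T X X^T W), evaluated on a mini-batch of columns (subjects) instead of the full data matrix, followed by a spectral-norm rescaling. The function names (opnmf_minibatch_step, diverse_batch, fit_stochastic_opnmf) are hypothetical, and diverse_batch is a simple distance-weighted sampler (in the style of k-means++ seeding) used only as a stand-in for a repulsive point process.

```python
import numpy as np


def opnmf_minibatch_step(W, X_batch, eps=1e-12):
    """One multiplicative opNMF-style update using only the columns in X_batch.

    W: (d, k) non-negative factor matrix; X_batch: (d, b) non-negative data.
    """
    XtW = X_batch.T @ W               # (b, k)
    numer = X_batch @ XtW             # (d, k), equals X X^T W
    denom = W @ (XtW.T @ XtW)         # (d, k), equals W W^T X X^T W
    W_new = W * numer / (denom + eps)
    # Assumption: rescale by the spectral norm to keep the iterates bounded.
    return W_new / np.linalg.norm(W_new, 2)


def diverse_batch(X, batch_size, rng):
    """Pick batch_size distinct columns of X that tend to be far apart.

    Stand-in for a repulsive point process: each new column is drawn with
    probability proportional to its squared distance to the nearest column
    already chosen, so near-duplicates are unlikely to share a batch.
    Assumes batch_size <= number of columns.
    """
    n = X.shape[1]
    chosen = [int(rng.integers(n))]
    d2 = np.sum((X - X[:, [chosen[0]]]) ** 2, axis=0)
    while len(chosen) < batch_size:
        total = d2.sum()
        if total > 0:
            nxt = int(rng.choice(n, p=d2 / total))
        else:
            # All remaining columns duplicate a chosen one: fall back to uniform.
            remaining = np.setdiff1d(np.arange(n), chosen)
            nxt = int(rng.choice(remaining))
        chosen.append(nxt)
        d2 = np.minimum(d2, np.sum((X - X[:, [nxt]]) ** 2, axis=0))
    return np.array(chosen)


def fit_stochastic_opnmf(X, k, batch_size, n_iters, seed=0):
    """Run n_iters mini-batch updates starting from a random non-negative W."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    for _ in range(n_iters):
        idx = diverse_batch(X, batch_size, rng)
        W = opnmf_minibatch_step(W, X[:, idx])
    return W
```

In this sketch each update costs on the order of d * b * k operations for a batch of b subjects, rather than d * n * k for all n subjects, which is the source of the savings when b is much smaller than n. Replacing diverse_batch with uniform sampling gives a plain stochastic variant for comparison.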

Keywords: Big data; MRI; NMF; Stochastic optimization.
