Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis

Bioinformatics. 2007 Jun 15;23(12):1495-502. doi: 10.1093/bioinformatics/btm134. Epub 2007 May 5.

Abstract

Motivation: Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when approximating high-dimensional data in a lower-dimensional space requires control over the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix.

Results: In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance in less computing time than existing NMF algorithms.
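
To make the alternating scheme concrete, the following is a minimal sketch, not the authors' supplementary software. The abstract does not state the objective, so the sketch assumes one: minimize ||A - WH||_F^2 + eta ||W||_F^2 + beta sum_j ||H(:,j)||_1^2 subject to W, H >= 0, where the L1 term encourages sparse columns of the coefficient matrix H. Under that assumption each half-step becomes an ordinary non-negative least squares problem on an augmented matrix. The function names, the default values of `beta` and `eta`, the random initialization, and the fixed iteration count are all illustrative choices.

```python
import numpy as np
from scipy.optimize import nnls


def nnls_columns(C, B):
    """Solve min_X ||C X - B||_F with X >= 0, one column of B at a time."""
    X = np.zeros((C.shape[1], B.shape[1]))
    for j in range(B.shape[1]):
        X[:, j], _ = nnls(C, B[:, j])
    return X


def sparse_nmf_anls(A, k, beta=0.1, eta=0.1, n_iter=50, seed=0):
    """Sketch of sparse NMF by alternating non-negative least squares.

    A is an (m, n) non-negative array, k the target rank.
    beta and eta are assumed penalty weights (illustrative defaults).
    Returns W of shape (m, k) and H of shape (k, n), both non-negative.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    W = rng.random((m, k))  # assumed initialization: uniform random
    H = np.zeros((k, n))
    for _ in range(n_iter):
        # H-step: appending the row sqrt(beta) * [1 ... 1] to W adds
        # beta * (sum of each column of H)^2 to the squared residual,
        # which equals beta * ||H(:, j)||_1^2 because H >= 0.
        C = np.vstack([W, np.sqrt(beta) * np.ones((1, k))])
        B = np.vstack([A, np.zeros((1, n))])
        H = nnls_columns(C, B)
        # W-step: appending sqrt(eta) * I to H^T adds eta * ||W||_F^2,
        # which keeps W bounded and the subproblem full rank.
        C = np.vstack([H.T, np.sqrt(eta) * np.eye(k)])
        B = np.vstack([A.T, np.zeros((k, m))])
        W = nnls_columns(C, B).T
    return W, H
```

For clustering in this setting, a common convention (again an assumption here) is to treat the columns of A as samples and assign sample j to the index of the largest entry in column j of H, for example `labels = H.argmax(axis=0)`. Solving each column with `scipy.optimize.nnls` is simple but slow for large matrices; a block solver for many right-hand sides would be the natural replacement.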

Availability: The software is available as supplementary material.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology / methods*
  • Data Interpretation, Statistical*
  • Databases, Genetic
  • Entropy
  • Factor Analysis, Statistical
  • Gene Expression
  • Humans
  • Least-Squares Analysis*
  • Microarray Analysis*
  • Neoplasms / classification
  • Neoplasms / genetics
  • Neoplasms / metabolism
  • Pattern Recognition, Automated / methods