Bioinformatics. 2017 Jan 15;33(2):235-242.
doi: 10.1093/bioinformatics/btw607. Epub 2016 Sep 23.

Robust classification of single-cell transcriptome data by nonnegative matrix factorization

Chunxuan Shao et al. Bioinformatics.

Abstract

Motivation: Single-cell transcriptome data provide unprecedented resolution to study heterogeneity in cell populations and present a challenge for unsupervised classification. Popular methods, like principal component analysis (PCA), often suffer from the high level of noise in the data.

Results: Here we adapt Nonnegative Matrix Factorization (NMF) to the problem of identifying subpopulations in single-cell transcriptome data. In contrast to the conventional gene-centered use of NMF, which identifies metagenes, we used NMF in a cell-centered direction to identify cell subtypes ('metacells'). Using three different datasets (based on RT-qPCR and single-cell RNA-seq data), we show that NMF outperforms PCA in identifying subpopulations accurately and robustly, without the need for prior feature selection; moreover, NMF successfully recovered the broad classes in a large dataset (thousands of single-cell transcriptomes) that had previously been identified by a computationally sophisticated method. NMF allows feature genes to be identified in a direct, unbiased manner. We propose novel approaches for determining a biologically meaningful number of subpopulations based on minimizing the ambiguity of classification. In conclusion, our study shows that NMF is a robust, informative and simple method for the unsupervised learning of cell subtypes from single-cell gene expression data.
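To make the cell-centered reading of NMF concrete, the following is a minimal sketch and not the authors' implementation. It assumes a nonnegative genes-by-cells matrix `V`, factorizes it as `V ≈ W H` with plain multiplicative updates (a generic NMF solver chosen here for brevity), and assigns each cell to the metacell with the largest coefficient in its column of `H`. The toy data, the helper names (`nmf_multiplicative`, `normalize_factors`, `assign_cells`, `top_genes`) and the simple "largest basis weight" rule for picking feature genes are all illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (genes x cells) as W @ H.

    Uses standard multiplicative updates for the squared-error objective.
    W is genes x rank (basis), H is rank x cells (coefficients).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = V.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_cells)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def normalize_factors(W, H):
    """Rescale so each column of W sums to 1, leaving the product W @ H unchanged.

    This makes coefficients in H comparable across metacells.
    """
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]


def assign_cells(H):
    """Cell-centered view: label each cell by its dominant metacell."""
    return np.argmax(H, axis=0)


def top_genes(W, component, n_top):
    """Illustrative feature-gene rule: genes with the largest basis weight."""
    return np.argsort(W[:, component])[::-1][:n_top]


# Hypothetical toy data: 6 genes, two groups of 10 cells each.
# Genes 0-2 are high in the first group, genes 3-5 in the second.
rng = np.random.default_rng(1)
profile_a = np.array([8.0, 8.0, 8.0, 1.0, 1.0, 1.0])
profile_b = np.array([1.0, 1.0, 1.0, 8.0, 8.0, 8.0])
V = np.column_stack([profile_a] * 10 + [profile_b] * 10)
V = V + rng.random(V.shape)  # nonnegative noise keeps V valid for NMF

W, H = nmf_multiplicative(V, rank=2)
W_n, H_n = normalize_factors(W, H)
labels = assign_cells(H_n)

print("cell labels:", labels)
print("feature genes for the metacell of cell 0:", top_genes(W_n, labels[0], 3))
```

The label indices themselves are arbitrary, since NMF does not order its components; only the grouping of cells is meaningful. On real data one would typically repeat the factorization from several random starts, because the result depends on the initialization.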

Availability and implementation: https://github.com/ccshao/nimfa

Contacts: c.shao@Dkfz-Heidelberg.de or t.hoefer@Dkfz-Heidelberg.de

Supplementary information: Supplementary data are available at Bioinformatics online.
