Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
- PMID: 28133571
- PMCID: PMC5251935
- DOI: 10.7717/peerj.2888
Abstract
Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated by this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method in analyzing a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. Compared with other unsupervised clustering methods, including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM.
Keywords: Clustering; Feature gene; Heterogeneity; Modularity; Non-negative matrix factorization; RNA-Seq; Single cell; Single cell sequencing; Single-cell; Subpopulation.
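To illustrate the general idea of NMF-based subpopulation detection described in the abstract, the following is a minimal sketch in plain Python. It is not the NMFEM package's implementation. It assumes the standard multiplicative-update rules for the squared-error objective and assigns each cell to the factor with the largest coefficient, which is one common convention. The function names (`nmf`, `assign_clusters`) and the toy expression matrix are hypothetical and made up for illustration only.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative genes x cells matrix V as W (genes x k) times H (k x cells).

    Uses multiplicative updates for the squared-error objective
    (an assumption; other NMF objectives and solvers exist).
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    # Rescale so each column of W sums to 1, moving the scale into H.
    # This removes the scale ambiguity between W and H before comparing factors.
    for f in range(k):
        s = sum(W[g][f] for g in range(n)) + eps
        for g in range(n):
            W[g][f] /= s
        H[f] = [h * s for h in H[f]]
    return W, H


def assign_clusters(H):
    """Assign each cell (a column of H) to the factor with the largest coefficient."""
    return [max(range(len(H)), key=lambda f: col[f]) for col in zip(*H)]


# Toy, made-up expression matrix: 4 genes (rows) x 6 cells (columns).
# Cells 0-2 express genes 0-1; cells 3-5 express genes 2-3.
V = [
    [5, 6, 5, 0, 0, 1],
    [4, 5, 6, 1, 0, 0],
    [0, 1, 0, 6, 5, 5],
    [1, 0, 0, 5, 6, 4],
]

W, H = nmf(V, k=2)
labels = assign_clusters(H)
print(labels)
```

In this sketch the rows of H play the role of per-cell group memberships and the columns of W indicate which genes drive each group. The D-score gene ranking and the FEM module detection mentioned in the abstract are not reproduced here.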
Conflict of interest statement
The authors declare there are no competing interests.
