A deep matrix factorization based approach for single-cell RNA-seq data clustering

Methods. 2022 Jun 28;205:114-122. doi: 10.1016/j.ymeth.2022.06.010. Online ahead of print.


The rapid development of single-cell sequencing technologies makes it possible to analyze cellular heterogeneity at the single-cell level. Cell clustering is one of the most fundamental and common steps in the heterogeneity analysis. However, due to the high noise level, high dimensionality and high sparsity, accurate cell clustering is still challengeable. Here, we present DeepCI, a new clustering approach for scRNA-seq data. Using two autoencoders to obtain cell embedding and gene embedding, DeepCI can simultaneously learn cell low-dimensional representation and clustering. In addition, the recovered gene expression matrix can be obtained by the matrix multiplication of cell and gene embedding. To evaluate the performance of DeepCI, we performed it on several real scRNA-seq datasets for clustering and visualization analysis. The experimental results show that DeepCI obtains the overall better performance than several popular single cell analysis methods. We also evaluated the imputation performance of DeepCI by a dedicated experiment. The corresponding results show that the imputed gene expression of known specific marker genes can greatly improve the accuracy of cell type classification.

Keywords: Autoencoder; Clustering; Matrix factorization; Single-cell RNA-seq.