Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever-increasing number of cells and batch effects impose computational challenges. We present DESC, an unsupervised deep embedding algorithm that clusters scRNA-seq data by iteratively optimizing a clustering objective function. Through iterative self-learning, DESC gradually removes batch effects, as long as technical differences across batches are smaller than true biological variations. Because DESC is a soft clustering algorithm, its cluster assignment probabilities are biologically interpretable and can reveal both discrete and pseudotemporal structure of cells. Comprehensive evaluations show that DESC offers a proper balance of clustering accuracy and stability, has a small memory footprint, does not explicitly require batch information for batch effect removal, and can use GPUs when available. As the scale of single-cell studies continues to grow, we believe DESC will offer a valuable tool for biomedical researchers to disentangle complex cellular heterogeneity.
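The abstract does not spell out the clustering objective or how the soft assignments are computed. As a rough illustration only, the sketch below assumes a deep-embedded-clustering style formulation: a Student's t kernel turns distances between embedded cells and cluster centroids into assignment probabilities, a sharpened target distribution is derived from those probabilities, and the KL divergence between the two is the quantity an iterative self-learning loop would minimize. The function names, the `alpha` parameter, and the toy 2-D embeddings are all hypothetical and are not taken from the DESC software.

```python
import math


def soft_assign(embeddings, centroids, alpha=1.0):
    """Soft cluster assignment with a Student's t kernel (assumed form).

    Returns one probability row per cell; each row sums to 1.
    """
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))  # squared distance
            weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """Sharpened targets: square q, divide by soft cluster size, renormalize."""
    n_clusters = len(q[0])
    soft_size = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / soft_size[j] for j in range(n_clusters)]
        total = sum(w)
        p.append([x / total for x in w])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all cells and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )


# Toy data: four "cells" already embedded in 2-D, two hypothetical centroids.
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)

# Hard labels are the most probable cluster for each cell.
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
```

In a full method the embeddings would come from a neural network encoder, and the encoder weights and centroids would be updated by gradient descent on this loss; that training loop is omitted here. The rows of `q` are the kind of per-cell assignment probabilities that can be inspected for cells lying between clusters.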
Conflict of interest statement
The authors declare no competing interests.
- R01EY030192/U.S. Department of Health & Human Services | NIH | National Eye Institute (NEI)
- R01GM125301/U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01GM108600/U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01HL150359/U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01HL113147/U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)