Challenges in unsupervised clustering of single-cell RNA-seq data

Nat Rev Genet. 2019 May;20(5):273-282. doi: 10.1038/s41576-018-0088-9.


Single-cell RNA sequencing (scRNA-seq) allows researchers to collect large catalogues detailing the transcriptomes of individual cells. Unsupervised clustering is of central importance for the analysis of these data, as it is used to identify putative cell types. However, there are many challenges involved. We discuss why clustering is a challenging problem from a computational point of view and what aspects of the data make it challenging. We also consider the difficulties related to the biological interpretation and annotation of the identified clusters.

Publication types

  • Review

MeSH terms

  • Cell Lineage / genetics*
  • Cluster Analysis
  • Computational Biology / methods*
  • Epigenesis, Genetic
  • Eukaryotic Cells / classification
  • Eukaryotic Cells / cytology
  • Eukaryotic Cells / metabolism
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing / statistics & numerical data*
  • Humans
  • RNA, Messenger / chemistry
  • RNA, Messenger / genetics*
  • RNA, Messenger / metabolism
  • Single-Cell Analysis / methods
  • Single-Cell Analysis / statistics & numerical data*
  • Transcriptome*
  • Unsupervised Machine Learning


Substances

  • RNA, Messenger