Discovery of rare cells from voluminous single cell expression data

Nat Commun. 2018 Nov 9;9(1):4719. doi: 10.1038/s41467-018-07234-6.


Single-cell messenger RNA sequencing (scRNA-seq) provides a window into transcriptional landscapes in complex tissues. The recent introduction of droplet-based transcriptomics platforms has enabled the parallel screening of thousands of cells. Large-scale single-cell transcriptomics is advantageous as it promises the discovery of a number of rare cell sub-populations. Existing algorithms for finding rare cells either scale unbearably slowly or terminate as the sample size grows to the order of tens of thousands. We propose Finder of Rare Entities (FiRE), an algorithm that, in a matter of seconds, assigns a rareness score to every individual expression profile under study. We demonstrate how FiRE scores can help bioinformaticians focus downstream analyses on only a fraction of the expression profiles within ultra-large scRNA-seq data. When applied to a large scRNA-seq dataset of mouse brain cells, FiRE recovered a novel sub-type of the pars tuberalis lineage.
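The abstract does not say how the rareness score is computed, so the following is only an illustrative sketch of the general idea of scoring every profile and then keeping a small fraction for downstream analysis. It assumes a simple hashing-based density estimate: each cell is hashed into buckets by thresholding randomly chosen genes, and cells that repeatedly land in sparsely populated buckets receive high scores. The function names `rareness_scores` and `select_rare`, the parameters `n_estimators`, `n_bits` and `k`, and the interquartile-range cutoff are all assumptions made for this example and should not be read as the FiRE implementation or its API.

```python
import math
import random
from collections import Counter


def rareness_scores(cells, n_estimators=50, n_bits=8, seed=0):
    """Hypothetical rareness score: high when a cell keeps landing in
    sparsely populated hash buckets. `cells` is a list of equal-length
    lists of expression values (one inner list per cell)."""
    rng = random.Random(seed)
    n_cells, n_genes = len(cells), len(cells[0])
    # Observed range of every gene, used to draw random thresholds.
    lo = [min(c[g] for c in cells) for g in range(n_genes)]
    hi = [max(c[g] for c in cells) for g in range(n_genes)]
    scores = [0.0] * n_cells
    for _ in range(n_estimators):
        # One estimator = n_bits random (gene, threshold) pairs.
        genes = [rng.randrange(n_genes) for _ in range(n_bits)]
        cuts = [rng.uniform(lo[g], hi[g]) for g in genes]
        # Bucket id of a cell: tuple of "expression above threshold?" bits.
        codes = [tuple(c[g] > t for g, t in zip(genes, cuts)) for c in cells]
        counts = Counter(codes)
        for i, code in enumerate(codes):
            # Low bucket occupancy contributes a large positive term.
            scores[i] -= 2.0 * math.log(counts[code] / n_cells)
    return scores


def select_rare(scores, k=1.5):
    """Assumed cutoff: indices of cells scoring above Q3 + k * IQR."""
    ordered = sorted(scores)
    q1 = ordered[len(ordered) // 4]
    q3 = ordered[(3 * len(ordered)) // 4]
    cutoff = q3 + k * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > cutoff]
```

Under these assumptions, a call such as `select_rare(rareness_scores(cells))` would return the indices of the profiles to carry forward. Each estimator makes a single pass over the cells, which is one plausible reason a scheme of this kind can stay fast as the number of cells grows.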

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Artificial Cells / metabolism
  • Brain / metabolism
  • Computer Simulation
  • Dendritic Cells / metabolism
  • Gene Expression Profiling*
  • HEK293 Cells
  • Humans
  • Jurkat Cells
  • Mice
  • Single-Cell Analysis / methods*
  • Time Factors