Pooling across cells to normalize single-cell RNA sequencing data with many zero counts

Genome Biol. 2016 Apr 27;17:75. doi: 10.1186/s13059-016-0947-7.


Normalization of single-cell RNA sequencing data is necessary to eliminate cell-specific biases prior to downstream analyses. However, this is not straightforward for noisy single-cell data where many counts are zero. We present a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization. Pool-based size factors are then deconvolved to yield cell-based factors. In simulated data, our deconvolution approach normalizes cell-specific biases more accurately than existing methods. Similar behavior is observed in real data, where deconvolution yields more relevant results in downstream analyses.
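
The pooling and deconvolution steps described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name pool_deconvolve, the sliding-window pools over cells in their given order, the pool sizes, the median-ratio estimate against an average pseudo-cell, and the Poisson simulation are all choices made here for illustration. Each pool's size factor is treated as approximately the sum of the size factors of its member cells, which gives a linear system that is solved by least squares.

```python
import numpy as np

def pool_deconvolve(counts, pool_sizes=(5, 7, 9, 11)):
    """Estimate one size factor per cell from a genes x cells count matrix.

    Hypothetical helper for illustration; pool sizes are arbitrary choices.
    """
    n_cells = counts.shape[1]

    # Reference pseudo-cell: average count of each gene across all cells.
    reference = counts.mean(axis=1)
    keep = reference > 0                      # drop genes never observed
    counts, reference = counts[keep], reference[keep]

    rows, pool_factors = [], []
    for size in pool_sizes:
        for start in range(n_cells):
            # Sliding window of `size` cells, wrapping around the end.
            members = (start + np.arange(size)) % n_cells
            pooled = counts[:, members].sum(axis=1)
            # Pool size factor: median ratio of pooled counts to reference.
            pool_factors.append(np.median(pooled / reference))
            row = np.zeros(n_cells)
            row[members] = 1.0                # pool factor ~ sum of cell factors
            rows.append(row)

    design = np.vstack(rows)
    # Deconvolve pool-based factors into cell-based factors by least squares.
    factors, *_ = np.linalg.lstsq(design, np.array(pool_factors), rcond=None)
    return factors / factors.mean()           # scale to a mean of 1

# Simulated data (assumed model): Poisson counts with a cell-specific bias.
rng = np.random.default_rng(0)
n_genes, n_cells = 2000, 60
gene_means = rng.gamma(shape=2.0, scale=1.0, size=n_genes)
true_bias = rng.uniform(0.5, 2.0, size=n_cells)
counts = rng.poisson(np.outer(gene_means, true_bias))

estimated = pool_deconvolve(counts)
truth = true_bias / true_bias.mean()
correlation = np.corrcoef(estimated, truth)[0, 1]
zero_fraction = (counts == 0).mean()
print(f"zero fraction: {zero_fraction:.2f}, correlation with truth: {correlation:.3f}")
```

Summing counts within a pool reduces the number of zeros entering each median ratio, which is the motivation stated in the abstract. The odd pool sizes are chosen in this sketch so that the stacked system has full column rank for 60 cells; other combinations of cell number and pool size may leave it rank-deficient.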

Keywords: Differential expression; Normalization; Single-cell RNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Calibration
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / standards
  • Humans
  • Sequence Analysis, RNA / methods*
  • Sequence Analysis, RNA / standards
  • Signal-To-Noise Ratio
  • Single-Cell Analysis / methods*
  • Single-Cell Analysis / standards