GMM-Demux: sample demultiplexing, multiplet detection, experiment planning, and novel cell-type verification in single cell sequencing

Genome Biol. 2020 Jul 30;21(1):188. doi: 10.1186/s13059-020-02084-2.

Abstract

Identifying and removing multiplets are essential to improving the scalability and the reliability of single cell RNA sequencing (scRNA-seq). Multiplets create artificial cell types in the dataset. We propose a Gaussian mixture model-based multiplet identification method, GMM-Demux. GMM-Demux accurately identifies and removes multiplets through sample barcoding, including cell hashing and MULTI-seq. GMM-Demux uses a droplet formation model to authenticate putative cell types discovered from a scRNA-seq dataset. We generated two in-house cell-hashing datasets and compared GMM-Demux against three state-of-the-art sample barcoding classifiers. We show that GMM-Demux is stable and highly accurate and recognizes 9 multiplet-induced fake cell types in a PBMC dataset.
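To make the idea of Gaussian mixture model-based demultiplexing concrete, the following is a minimal sketch and not the published GMM-Demux implementation. It assumes that each sample barcode (hashtag) can be modeled independently by a two-component one-dimensional Gaussian mixture over log-transformed counts, and that a droplet positive for more than one barcode is a multiplet. The `log1p` transform, the 0.5 posterior threshold, the toy counts, and all function names are illustrative assumptions.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, iterations=50):
    """Fit a two-component 1D Gaussian mixture with EM.

    Returns (weights, means, variances), ordered so that index 0 is the
    low (background) component and index 1 is the high (positive) one.
    """
    n = len(values)
    grand_mean = sum(values) / n
    start_var = max(sum((v - grand_mean) ** 2 for v in values) / n, 1e-6)
    means = [min(values), max(values)]  # start at the two extremes
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            spread = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(spread, 1e-6)  # floor keeps the density finite
    order = sorted(range(2), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [variances[k] for k in order])


def posterior_high(value, model):
    """Posterior probability that a value belongs to the high component."""
    weights, means, variances = model
    dens = [weights[k] * normal_pdf(value, means[k], variances[k]) for k in range(2)]
    total = sum(dens)
    return dens[1] / total if total > 0 else 0.0


def classify_droplets(counts, threshold=0.5):
    """Label each droplet as 'sample_<j>', 'multiplet', or 'negative'.

    counts: one row per droplet, one column per sample barcode.
    """
    n_tags = len(counts[0])
    logged = [[math.log1p(c) for c in row] for row in counts]
    models = [fit_two_gaussians([row[j] for row in logged]) for j in range(n_tags)]
    labels = []
    for row in logged:
        positives = [j for j in range(n_tags)
                     if posterior_high(row[j], models[j]) > threshold]
        if not positives:
            labels.append("negative")
        elif len(positives) == 1:
            labels.append(f"sample_{positives[0]}")
        else:
            labels.append("multiplet")
    return labels


# Hypothetical toy counts for two barcodes (not data from the paper).
toy_counts = (
    [(200 + i, 5 + i % 3) for i in range(20)]    # high on barcode 0 only
    + [(4 + i % 3, 180 + i) for i in range(20)]  # high on barcode 1 only
    + [(210, 190)] * 5                           # high on both barcodes
    + [(5, 6)] * 5                               # background on both
)
toy_labels = classify_droplets(toy_counts)
```

In this sketch, droplets that are confidently positive for two barcodes are the ones flagged for removal. A real pipeline would also need a suitable normalization and a principled treatment of classification confidence, neither of which is specified here.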

Keywords: Demultiplex; Multiplets; Phony cell type; Rare cell type; Sample barcoding; Single cell RNA.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Bayes Theorem
  • Humans
  • Molecular Typing / methods*
  • Sequence Analysis, RNA*
  • Single-Cell Analysis*