Valid Post-clustering Differential Analysis for Single-Cell RNA-Seq

Cell Syst. 2019 Oct 23;9(4):383-392.e6. doi: 10.1016/j.cels.2019.07.012. Epub 2019 Sep 11.

Abstract

Single-cell computational pipelines involve two critical steps: organizing cells (clustering) and identifying the markers driving this organization (differential expression analysis). State-of-the-art pipelines perform differential analysis after clustering on the same dataset. We observe that because clustering "forces" separation, reusing the same dataset generates artificially low p values and hence false discoveries. We introduce a valid post-clustering differential analysis framework, which corrects for this problem. We provide software at https://github.com/jessemzhang/tn_test.
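
The inflation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' tn_test software and does not implement their correction. It only demonstrates the problem, under stated assumptions: a single gene whose expression is drawn from one N(0, 1) population (so no true markers exist), a median split as a simple stand-in for clustering, and a normal approximation to the Welch t statistic for the p value. All function names are hypothetical.

```python
import math
import random
import statistics


def welch_z_pvalue(a, b):
    """Two-sided p value for a difference in means, using a normal
    approximation to the Welch t statistic (assumes large samples)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def split_by_threshold(values):
    """Stand-in for clustering (an assumption): split one gene's values
    at the median, which forces the two groups apart."""
    cut = statistics.median(values)
    low = [v for v in values if v <= cut]
    high = [v for v in values if v > cut]
    return low, high


def simulate(n_cells=200, seed=0):
    rng = random.Random(seed)
    # Null data: every cell comes from the same N(0, 1) population.
    expr = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]

    # (a) Cluster, then test on the same data.
    low, high = split_by_threshold(expr)
    p_clustered = welch_z_pvalue(low, high)

    # (b) Group labels assigned independently of the data.
    shuffled = expr[:]
    rng.shuffle(shuffled)
    half = n_cells // 2
    p_random = welch_z_pvalue(shuffled[:half], shuffled[half:])
    return p_clustered, p_random


if __name__ == "__main__":
    p_clustered, p_random = simulate()
    print("p value after clustering on the same data:", p_clustered)
    print("p value with data-independent labels:     ", p_random)
```

Although the data contain no real group structure, the p value computed after the split is extremely small, because the split itself guarantees that the group means differ. With labels assigned independently of the data, the same test behaves as expected under the null.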

Keywords: differential expression; p value; selective inference; single-cell RNA-seq.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods*
  • Datasets as Topic
  • Gene Expression Profiling
  • Humans
  • Selection Bias
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*
  • Software