Simulating Single-Cell Gene Expression Count Data with Preserved Gene Correlations by scDesign2

J Comput Biol. 2022 Jan;29(1):23-26. doi: 10.1089/cmb.2021.0440. Epub 2022 Jan 11.

Abstract

scDesign2 is a transparent simulator that generates high-fidelity single-cell gene expression count data while preserving gene correlations. This article shows how to download and install the scDesign2 R package, how to fit a probabilistic model to each cell type in real data and simulate synthetic data from the fitted models, and how to use scDesign2 to guide experimental design and benchmark computational methods. It closes with a note on cell clustering as a preprocessing step before model fitting and data simulation.
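
The fit-then-simulate workflow summarized above (one model per cell type, synthetic counts drawn from the fitted models, gene correlations retained) can be illustrated with a toy sketch. It is not the scDesign2 R package or its API. It is a simplified Python illustration that assumes only two genes, Poisson marginals, and a Gaussian copula as one way to carry gene-gene correlation into count data. All function names (`fit_cell_type`, `simulate_cell_type`, and the helpers), the cell-type labels, and the parameter values are hypothetical, and the "real" data here are themselves simulated stand-ins.

```python
import math
import random
from statistics import NormalDist

NORM = NormalDist()  # standard normal, used for the copula transforms

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    pmf = math.exp(-lam)
    cdf = pmf
    for i in range(1, k + 1):
        pmf *= lam / i
        cdf += pmf
    return min(cdf, 1.0)

def poisson_quantile(u, lam, k_max=10_000):
    """Smallest k with P(X <= k) >= u (search capped at k_max)."""
    k, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < u and k < k_max:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

def normal_scores(values, lam):
    """Map counts to z-scores through the midpoint of each Poisson CDF jump."""
    scores = []
    for x in values:
        u = 0.5 * (poisson_cdf(x - 1, lam) + poisson_cdf(x, lam))
        u = min(max(u, 1e-9), 1 - 1e-9)  # keep inv_cdf inside its open domain
        scores.append(NORM.inv_cdf(u))
    return scores

def pearson(xs, ys):
    """Pearson correlation, clamped to [-1, 1]; 0 if either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))

def fit_cell_type(cells):
    """Fit Poisson means and a copula correlation to (gene_a, gene_b) counts."""
    a = [c[0] for c in cells]
    b = [c[1] for c in cells]
    lam_a, lam_b = sum(a) / len(a), sum(b) / len(b)
    rho = pearson(normal_scores(a, lam_a), normal_scores(b, lam_b))
    return {"lam": (lam_a, lam_b), "rho": rho}

def simulate_cell_type(model, n_cells, rng):
    """Draw correlated normals, then push each through a Poisson quantile."""
    (lam_a, lam_b), rho = model["lam"], model["rho"]
    cells = []
    for _ in range(n_cells):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(max(0.0, 1 - rho * rho)) * rng.gauss(0.0, 1.0)
        cells.append((poisson_quantile(NORM.cdf(z1), lam_a),
                      poisson_quantile(NORM.cdf(z2), lam_b)))
    return cells

rng = random.Random(0)
# Toy stand-in for real data: two hypothetical cell types with known parameters.
truth = {"typeA": {"lam": (5.0, 8.0), "rho": 0.8},
         "typeB": {"lam": (1.0, 2.0), "rho": -0.5}}
real = {ct: simulate_cell_type(m, 500, rng) for ct, m in truth.items()}

# One fitted model per cell type, then synthetic cells from each fitted model.
models = {ct: fit_cell_type(cells) for ct, cells in real.items()}
synthetic = {ct: simulate_cell_type(m, 200, rng) for ct, m in models.items()}
```

Fitting each cell type separately keeps the marginal means and the correlation specific to that type, which is why cell-type labels (for example, from clustering) are needed before fitting. The number of synthetic cells per type is a free parameter at simulation time, which is how a simulator of this kind can be used to explore experimental designs.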

Keywords: gene correlation; gene expression counts; simulator; single-cell RNA-seq.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis
  • Computational Biology
  • Computer Simulation
  • Databases, Nucleic Acid / statistics & numerical data
  • Gene Expression
  • Gene Expression Profiling / statistics & numerical data*
  • Mice
  • Models, Statistical
  • RNA-Seq / statistics & numerical data
  • Single-Cell Analysis / statistics & numerical data*
  • Software*