An empirical Bayes method for differential expression analysis of single cells with deep generative models

Proc Natl Acad Sci U S A. 2023 May 23;120(21):e2209124120. doi: 10.1073/pnas.2209124120. Epub 2023 May 16.

Abstract

Detecting differentially expressed genes is important for characterizing subpopulations of cells. In scRNA-seq data, however, nuisance variation due to technical factors like sequencing depth and RNA capture efficiency obscures the underlying biological signal. Deep generative models have been extensively applied to scRNA-seq data, with a special focus on embedding cells into a low-dimensional latent space and correcting for batch effects. However, little attention has been paid to the problem of utilizing the uncertainty from the deep generative model for differential expression (DE). Furthermore, the existing approaches do not allow for controlling for effect size or the false discovery rate (FDR). Here, we present lvm-DE, a generic Bayesian approach for performing DE predictions from a fitted deep generative model, while controlling the FDR. We apply the lvm-DE framework to scVI and scSphere, two deep generative models. The resulting approaches outperform state-of-the-art methods at estimating the log fold change in gene expression levels as well as detecting differentially expressed genes between subpopulations of cells.

Keywords: deep generative modeling; differential expression; scRNA-seq.
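
Illustrative sketch (not the authors' implementation): the abstract describes calling genes as differentially expressed from a fitted generative model while accounting for effect size and controlling the FDR. One common Bayesian way to do this, assumed here for illustration, is to (1) estimate for each gene the posterior probability that the absolute log fold change exceeds a threshold, and (2) select the largest set of genes whose expected posterior false discovery proportion stays at or below a target level. The function names, the threshold `delta`, the level `alpha`, and the input array of posterior log-fold-change samples are all hypothetical; how such samples are obtained from scVI or scSphere is not shown.

```python
import numpy as np


def posterior_de_probability(lfc_samples, delta=0.5):
    """Per-gene fraction of posterior samples with |LFC| > delta.

    lfc_samples: array of shape (n_samples, n_genes) holding posterior
    draws of the log fold change (assumed to come from a fitted model).
    delta: hypothetical effect-size threshold on the log fold change.
    """
    lfc_samples = np.asarray(lfc_samples, dtype=float)
    return (np.abs(lfc_samples) > delta).mean(axis=0)


def select_by_expected_fdr(p_de, alpha=0.05):
    """Boolean mask of the largest gene set with expected FDR <= alpha.

    p_de: per-gene posterior probabilities of differential expression.
    Genes are ranked by decreasing probability; for the top k genes the
    expected false discovery proportion is the mean of (1 - p_de).
    """
    p_de = np.asarray(p_de, dtype=float)
    order = np.argsort(-p_de, kind="stable")
    # Running mean of (1 - p) over the ranked genes; nondecreasing in k.
    expected_fdr = np.cumsum(1.0 - p_de[order]) / np.arange(1, p_de.size + 1)
    passing = np.nonzero(expected_fdr <= alpha)[0]
    k = passing[-1] + 1 if passing.size else 0
    selected = np.zeros(p_de.size, dtype=bool)
    selected[order[:k]] = True
    return selected


if __name__ == "__main__":
    # Toy posterior draws for two genes (made-up numbers, not real data).
    samples = np.array([[1.0, 0.1], [0.8, -0.2], [-0.9, 0.7], [0.6, 0.0]])
    print(posterior_de_probability(samples, delta=0.5))
    print(select_by_expected_fdr([0.99, 0.2, 0.97, 0.5], alpha=0.05))
```

Because the genes are ranked by decreasing probability, the running expected FDR never decreases, so the selected set is always a prefix of the ranking. Raising `delta` makes the call more conservative about small effects, and lowering `alpha` shrinks the selected set.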

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Gene Expression Profiling / methods
  • RNA*
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods

Substances

  • RNA