CoCoA-diff: counterfactual inference for single-cell gene expression analysis

Genome Biol. 2021 Aug 17;22(1):228. doi: 10.1186/s13059-021-02438-4.


Finding a causal gene is a fundamental problem in genomic medicine. We present a causal inference framework, CoCoA-diff, that prioritizes disease genes in single-cell RNA-seq data by adjusting for confounders without prior knowledge of control variables. We demonstrate that our method substantially improves statistical power in simulations and in real-world data analysis of 70k brain cells collected for dissecting Alzheimer's disease. We identify 215 differentially regulated causal genes in various cell types, including highly relevant genes with a proper cell type context. Genes found in different cell types are enriched in distinct pathways, highlighting the importance of cell types in understanding multifaceted disease mechanisms.

Keywords: Alzheimer’s disease; Causal inference; Counterfactual inference; Single-cell RNA-seq.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alzheimer Disease / genetics
  • Brain
  • Causality
  • Gene Expression*
  • Genetic Techniques*
  • Genomic Medicine
  • Humans
  • Models, Statistical
  • RNA-Seq
  • Single-Cell Analysis*
  • Transcriptome