Graph embedding and Gaussian mixture variational autoencoder network for end-to-end analysis of single-cell RNA sequencing data

Cell Rep Methods. 2023 Jan 5;3(1):100382. doi: 10.1016/j.crmeth.2022.100382. eCollection 2023 Jan 23.

Abstract

Single-cell RNA sequencing (scRNA-seq) is a revolutionary technology for measuring gene expression in individual cells and for identifying cell heterogeneity and subpopulations. However, technical limitations of scRNA-seq lead to heterogeneous and sparse data. Here, we present autoCell, a deep-learning approach for scRNA-seq dropout imputation and feature extraction. autoCell is a variational autoencoding network that combines graph embedding with a deep probabilistic Gaussian mixture model to infer the distribution of high-dimensional, sparse scRNA-seq data. We validate autoCell on simulated datasets and biologically relevant scRNA-seq datasets. We show that imputation with autoCell improves the performance of existing tools in identifying cell developmental trajectories of human preimplantation embryos. Using Alzheimer's disease as a prototypical example, we identify disease-associated astrocytes (DAAs) and reconstruct DAA-specific molecular networks and ligand-receptor interactions involved in cell-cell communication. autoCell provides a toolbox for end-to-end analysis of scRNA-seq data, including visualization, clustering, imputation, and disease-specific gene network identification.
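
The abstract does not give implementation details, so the following is only a minimal sketch of one ingredient it names: how a Gaussian mixture over a latent space can assign a cell embedding to clusters. It is not the autoCell code. The function name `gmm_responsibilities`, the diagonal-covariance assumption, and the toy parameters are all hypothetical choices made for illustration.

```python
import math

def gmm_responsibilities(z, weights, means, variances):
    """Posterior cluster probabilities p(c | z) for one latent vector z
    under a Gaussian mixture with diagonal covariances (an assumption)."""
    log_joint = []
    for w, mu, var in zip(weights, means, variances):
        # log N(z | mu, diag(var)), summed over latent dimensions
        log_pdf = -0.5 * sum(
            (x - m) ** 2 / v + math.log(2.0 * math.pi * v)
            for x, m, v in zip(z, mu, var)
        )
        log_joint.append(math.log(w) + log_pdf)
    # Subtract the maximum before exponentiating for numerical stability
    top = max(log_joint)
    unnorm = [math.exp(v - top) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy two-component mixture in a 2-D latent space (hypothetical values)
weights = [0.5, 0.5]
means = [[-3.0, 0.0], [3.0, 0.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Three made-up cell embeddings: near component 0, near component 1, in between
cells = [[-2.8, 0.3], [3.1, -0.2], [0.0, 1.0]]
resp = [gmm_responsibilities(z, weights, means, variances) for z in cells]
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
```

In a Gaussian mixture variational autoencoder, the latent vectors would come from a trained encoder and the mixture parameters would be learned jointly with the network; here they are fixed by hand so the clustering step can be read in isolation.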

Keywords: Alzheimer’s disease; deep learning; disease-associated astrocyte; scRNA-seq; single cell/nuclei; variational autoencoding network.

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, N.I.H., Extramural

MeSH terms

  • Antiviral Agents*
  • Gene Regulatory Networks / genetics
  • Humans
  • Models, Statistical
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods

Substances

  • Antiviral Agents