scSemiGAN: a single-cell semi-supervised annotation and dimensionality reduction framework based on generative adversarial network

Bioinformatics. 2022 Nov 15;38(22):5042-5048. doi: 10.1093/bioinformatics/btac652.

Abstract

Motivation: Cell-type annotation plays a crucial role in single-cell RNA-seq (scRNA-seq) data analysis. As more well-annotated scRNA-seq reference datasets become publicly available, automatic label-transfer algorithms are gaining popularity over manual, marker gene-based annotation methods. However, most existing methods do not unify cell-type annotation with dimensionality reduction, and they cannot generate deep latent representations from the perspective of data generation.

Results: In this article, we propose scSemiGAN, a single-cell semi-supervised cell-type annotation and dimensionality reduction framework based on a generative adversarial network. To overcome these challenges, scSemiGAN models scRNA-seq data from the perspective of data generation and performs deep latent representation learning and cell-type label prediction simultaneously. In extensive comparisons with four state-of-the-art annotation methods on diverse simulated and real scRNA-seq datasets, scSemiGAN achieves competitive or superior performance in multiple downstream tasks, including cell-type annotation, latent representation visualization, confounding factor removal and enrichment analysis.

Availability and implementation: The code and data of scSemiGAN are available on GitHub: https://github.com/rafa-nadal/scSemiGAN.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms*
  • Data Analysis
  • Gene Expression Profiling / methods
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Whole Exome Sequencing