Bioinformatics. 2022 Apr 28;38(9):2642-2644.
doi: 10.1093/bioinformatics/btac141.

scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets

Massimo Andreatta et al. Bioinformatics. 2022.

Abstract

Summary: A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. Here, we present scGate, an algorithm that automates marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. scGate purifies a cell population of interest using a set of markers organized in a hierarchical structure, akin to gating strategies employed in flow cytometry. scGate outperforms state-of-the-art single-cell classifiers and can be applied to multiple modalities of single-cell data (e.g. RNA-seq, ATAC-seq, CITE-seq). scGate is implemented as an R package and integrated with the Seurat framework, providing an intuitive tool to isolate cell populations of interest from heterogeneous single-cell datasets.
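The hierarchical, flow-cytometry-like gating idea described in the summary can be illustrated with a small toy sketch. This is not the scGate implementation (scGate is an R package) and it does not use the scGate API: the cell names, expression values, marker choices (PTPRC, CD68, CD3D), the mean-expression score and the threshold of 1.0 are all assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical toy data: normalized expression per gene, per cell.
cells = {
    "cell_1": {"PTPRC": 2.1, "CD68": 1.8, "CD3D": 0.0},  # immune, macrophage-like
    "cell_2": {"PTPRC": 1.9, "CD68": 0.0, "CD3D": 2.2},  # immune, T-cell-like
    "cell_3": {"PTPRC": 0.0, "CD68": 0.1, "CD3D": 0.0},  # non-immune
}

# Hypothetical two-level gating model: immune cells first, then macrophages.
gating_model = [
    {"name": "immune", "positive": ["PTPRC"], "negative": []},
    {"name": "macrophage", "positive": ["CD68"], "negative": ["CD3D"]},
]


def signature_score(expr, genes):
    """Mean expression of a marker set (a simplification chosen for this sketch)."""
    return mean(expr.get(g, 0.0) for g in genes) if genes else 0.0


def passes_level(expr, level, threshold=1.0):
    """A cell passes if positive markers are high and negative markers are not."""
    pos = signature_score(expr, level["positive"])
    neg = signature_score(expr, level["negative"])
    return pos > threshold and neg <= threshold


def gate(cells, model, threshold=1.0):
    """Apply the levels in order; only cells passing a level reach the next one."""
    kept = set(cells)
    for level in model:
        kept = {c for c in kept if passes_level(cells[c], level, threshold)}
    return kept


if __name__ == "__main__":
    print(sorted(gate(cells, gating_model)))
```

Each level only sees the cells that survived the previous one, which is what makes the model hierarchical: the macrophage gate never has to distinguish macrophages from non-immune cells.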

Availability and implementation: scGate is available as an R package at https://github.com/carmonalab/scGate (https://doi.org/10.5281/zenodo.6202614). Several reproducible workflows describing the main functions and usage of the package on different single-cell modalities, as well as the code to reproduce the benchmark, can be found at https://github.com/carmonalab/scGate.demo (https://doi.org/10.5281/zenodo.6202585) and https://github.com/carmonalab/scGate.benchmark. Test data are available at https://doi.org/10.6084/m9.figshare.16826071.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Purifying cell populations from single-cell datasets using scGate. (A) Uniform Manifold Approximation and Projection (UMAP) representation of scRNA-seq data of PBMC populations annotated by Hao et al. (2021). (B) Purification of target cell types using scGate, for B cells on the left (using marker MS4A1 [encoding CD20]) and NK cells on the right (using NCAM1 [encoding CD56] and KLRD1 as positive markers, and CD3D as a negative marker). The violin plots display normalized ADT counts for the indicated proteins on the same cells. Precision (PREC), recall (REC) and Matthews correlation coefficient (MCC) are shown. (C) UMAP representation of scRNA-seq data of melanoma tumors annotated by Jerby-Arnon et al. (2018). (D) Purification of macrophages using a hierarchical gating model (GM): immune cells at the first level (left panel) and macrophages at the second level (middle panel). Macrophage gene signature (UCell) scores are shown in the right panel. (E) scGate purification of monocytes using DNA accessibility of a PBMC 10× multiomics dataset. Violin plots display coupled RNA expression values. Gene-associated accessibility values were inferred using Signac (Stuart et al., 2021). (F) PREC (positive predictive value) and MCC values for five publicly available scRNA-seq datasets (derived from blood or tumors) for scGate and three other cell type classifiers.
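For readers unfamiliar with the metrics reported in the figure, the following minimal sketch computes precision, recall and MCC from confusion-matrix counts using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper or its benchmark.

```python
from math import sqrt


def gating_metrics(tp, fp, fn, tn):
    """Return (precision, recall, MCC) from confusion-matrix counts.

    tp: target cells kept, fp: non-target cells kept,
    fn: target cells discarded, tn: non-target cells discarded.
    Undefined ratios (zero denominator) are returned as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


if __name__ == "__main__":
    # Hypothetical counts for a purified population in a 1000-cell dataset.
    prec, rec, mcc = gating_metrics(tp=90, fp=10, fn=5, tn=895)
    print(f"PREC={prec:.3f} REC={rec:.3f} MCC={mcc:.3f}")
```

MCC is useful alongside precision because it uses all four counts, so it stays informative when the target population is a small fraction of the dataset.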

References

    1. Abdelaal T. et al. (2019) A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol., 20, 194.
    2. Andreatta M., Carmona S.J. (2021) UCell: robust and scalable single-cell gene signature scoring. Comput. Struct. Biotechnol. J., 19, 3796–3798.
    3. Aran D. et al. (2019) Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol., 20, 163–172.
    4. Hao Y. et al. (2021) Integrated analysis of multimodal single-cell data. Cell, 184, 3573–3587.e29.
    5. Huang Q. et al. (2021) Evaluation of cell type annotation R packages on single-cell RNA-seq data. Genomics Proteomics Bioinformatics, 19, 267–281.
