NEXT-scASV: A Nextflow Pipeline for Allele-Specific Variants Calling from single cell RNA-seq data

Gigascience. 2026 Apr 6:giag042. doi: 10.1093/gigascience/giag042. Online ahead of print.

Abstract

The rapid accumulation of single-cell sequencing data presents major computational challenges in reproducibility, scaling, and the handling of data characteristics such as sparsity and technical variation, which complicate even basic analyses. Allele-specific analysis adds a further level of complexity: it aims to identify differential gene expression or regulation between homologous chromosomes by estimating the allelic imbalance of read counts at individual single-nucleotide variants. Here we present NEXT-scASV, a scalable Nextflow pipeline for calling allele-specific variants (ASVs) from 5' single-cell RNA sequencing data. NEXT-scASV automates the entire process (from read alignment and quality control to variant calling and statistical evaluation of the allelic imbalance) within a containerized environment, ensuring reproducibility and ease of deployment across platforms. NEXT-scASV can perform de novo ASV detection from single-cell sequencing data without prior genotyping. Its modular design allows for massive parallelization, efficiently handling the scale of modern atlas-level studies. We validate NEXT-scASV on a dataset of 135,000 peripheral blood mononuclear cells (PBMCs) from 57 donors, demonstrating that it processes large-scale data efficiently, completing the analysis in a week on a cluster with one node and 100 threads. Crucially, the pipeline reliably identifies ASVs even in rare cell populations (e.g., gdT GZMBhi and memory B IGHMhi cells), which remain elusive in bulk analyses. We also successfully detect allele-specific regulation of long non-coding RNAs and other lowly expressed, cell type-specific genes. Genes linked to detected ASVs show a high concordance (80%) with previously reported eQTLs. This concordance indicates that NEXT-scASV produces biologically relevant results, making it a powerful tool for uncovering allele-specific regulation in large-scale, complex single-cell studies.

Keywords: Nextflow; allele-specific variants; computational pipeline; regulatory variants; single-cell RNA-seq.
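
Illustrative sketch: the abstract describes statistical evaluation of allelic imbalance from read counts at individual single-nucleotide variants, but does not say which test the pipeline uses. The Python example below is therefore only an assumption of what such a step could look like: a two-sided exact binomial test of reference versus alternative read counts against an expected 50:50 ratio. The function names, the 0.5 null proportion, and the read counts are hypothetical and are not taken from NEXT-scASV.

```python
from math import comb


def allelic_ratio(ref_count: int, alt_count: int) -> float:
    """Fraction of reads supporting the reference allele at one variant."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this variant")
    return ref_count / total


def binomial_two_sided_p(ref_count: int, alt_count: int, p_ref: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic imbalance.

    Assumption (not from the source): under the null hypothesis each read
    comes from the reference allele with probability p_ref. The p-value sums
    the probabilities of all outcomes that are no more likely than the
    observed one. Intended for modest coverage; very large counts would
    overflow the float conversion and call for a statistics library.
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0  # no data, no evidence of imbalance
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small relative tolerance so that ties are not lost to rounding.
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical counts: 10 reference reads, 0 alternative reads.
print(allelic_ratio(10, 0))         # 1.0
print(binomial_two_sided_p(10, 0))  # 0.001953125
```

In a real analysis, p-values like these would be computed per variant (and, in a single-cell setting, plausibly per cell type) and then corrected for multiple testing; an established implementation such as `scipy.stats.binomtest` would be a more robust choice than this hand-written version.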