Single-Cell Omics for Transcriptome CHaracterization (SCOTCH): isoform-level characterization of gene expression through long-read single-cell RNA sequencing

bioRxiv [Preprint]. 2025 Feb 6:2024.04.29.590597. doi: 10.1101/2024.04.29.590597.

Abstract

Recent developments in long-read single-cell transcriptome sequencing (lr-scRNA-Seq) represent a significant leap forward in single-cell genomics. With the recent introduction of R10 flowcells by Oxford Nanopore, we propose that previous computational methods designed to handle high sequencing error rates are less relevant, and that the traditional approach of using short reads to compile a "barcode space" (candidate barcode list) to de-multiplex long reads is no longer necessary. Instead, computational methods should now shift focus to harnessing the unique benefits of long reads to analyze transcriptome complexity. In this context, we introduce a comprehensive suite of computational methods named Single-Cell Omics for Transcriptome CHaracterization (SCOTCH). SCOTCH supports both Nanopore and PacBio sequencing platforms, and is compatible with single-cell library preparation protocols from both 10X Genomics and Parse Biosciences. Through a sub-exon identification strategy with dynamic thresholding and read mapping scores, SCOTCH precisely aligns reads to known isoforms and discovers novel isoforms, efficiently addressing ambiguous mapping challenges commonly encountered in long-read single-cell data. Comprehensive simulations and real data analyses across multiple platforms (including 10X Genomics and Parse Biosciences, paired with Illumina or Nanopore sequencing technologies with R9 and R10 flowcells, as well as PacBio sequencing) demonstrated that SCOTCH outperforms existing methods in mapping accuracy, quantification accuracy and novel isoform detection, while also uncovering novel biological insights on transcriptome complexity at the single-cell level.
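
To make the idea of sub-exon based read-to-isoform assignment more concrete, the following Python sketch shows one generic way such a scheme could work. It is not SCOTCH's implementation: the function names (sub_exons, coverage_pattern, compatible_isoforms), the half-open coordinate convention, and the fixed min_fraction cutoff are all assumptions made for illustration. In particular, the fixed cutoff merely stands in for the dynamic thresholding and read mapping scores mentioned in the abstract, whose details are not given here.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def sub_exons(isoforms: Dict[str, List[Interval]]) -> List[Interval]:
    """Split the annotated exons of one gene into non-overlapping sub-exons."""
    # Every annotated exon boundary is a candidate cut point.
    cuts = sorted({p for exons in isoforms.values() for iv in exons for p in iv})
    segments = []
    for start, end in zip(cuts, cuts[1:]):
        # Keep a segment only if it lies inside at least one annotated exon.
        if any(s <= start and end <= e
               for exons in isoforms.values() for s, e in exons):
            segments.append((start, end))
    return segments


def coverage_pattern(blocks: List[Interval], segments: List[Interval],
                     min_fraction: float = 0.5) -> List[int]:
    """Mark each sub-exon as covered (1) or skipped (0) by aligned blocks."""
    pattern = []
    for s, e in segments:
        overlap = sum(max(0, min(e, be) - max(s, bs)) for bs, be in blocks)
        # Hypothetical fixed cutoff; a real tool may adapt this per sub-exon.
        pattern.append(1 if overlap / (e - s) >= min_fraction else 0)
    return pattern


def compatible_isoforms(read_blocks: List[Interval],
                        isoforms: Dict[str, List[Interval]],
                        min_fraction: float = 0.5) -> List[str]:
    """Return isoforms whose sub-exon pattern equals the read's pattern."""
    segments = sub_exons(isoforms)
    read_pattern = coverage_pattern(read_blocks, segments, min_fraction)
    return [name for name, exons in isoforms.items()
            if coverage_pattern(exons, segments, 1.0) == read_pattern]


# Toy gene: iso_A includes a middle exon that iso_B skips.
toy_isoforms = {
    "iso_A": [(100, 200), (300, 400), (500, 600)],
    "iso_B": [(100, 200), (500, 600)],
}
# A read whose aligned blocks cover only the first and last exon.
toy_read = [(105, 200), (500, 590)]
matches = compatible_isoforms(toy_read, toy_isoforms)
```

In this toy case the read skips the middle sub-exon, so only iso_B matches. A read whose pattern matches no annotated isoform would come back with an empty list, which is the kind of read a novel-isoform discovery step would examine further.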

Publication types

  • Preprint