Background: Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to perform such analyses for single cells. However, scRNA-seq data are characterized by drop-out events and low library sizes. It is thus not clear whether functional transcription factor (TF) and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq data in a meaningful way.
Results: To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy and GO enrichment, which estimate pathway activities, and DoRothEA, which estimates transcription factor (TF) activities, and compare them against the tools SCENIC/AUCell and metaVIPER, which were designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data generated after CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal performance comparable to that on the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.
Conclusions: Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.
Keywords: Benchmark; Functional analysis; Pathway analysis; Transcription factor analysis; scRNA-seq.
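The in silico study mentioned in the Results simulates single cells from bulk RNA-seq samples. The abstract does not describe the procedure, so the following is only a minimal sketch of one plausible approach, assuming that each simulated cell is drawn by multinomial downsampling of a bulk count vector to a small library size. The function name `simulate_cells_from_bulk` and its parameters are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def simulate_cells_from_bulk(bulk_counts, n_cells, library_size, seed=0):
    """Draw simulated single cells from one bulk sample (illustrative sketch).

    Assumption: each cell is a multinomial sample of `library_size` reads,
    with gene probabilities proportional to the bulk counts. A small
    `library_size` naturally produces zeros for lowly expressed genes,
    which mimics drop-out events.
    """
    rng = np.random.default_rng(seed)
    bulk_counts = np.asarray(bulk_counts, dtype=float)
    total = bulk_counts.sum()
    if total <= 0:
        raise ValueError("bulk_counts must contain at least one positive count")
    gene_probs = bulk_counts / total
    # Result has shape (n_cells, n_genes); each row sums to library_size.
    return rng.multinomial(library_size, gene_probs, size=n_cells)


# Hypothetical bulk profile with five genes; the last gene is not expressed.
bulk = [5000, 1200, 300, 10, 0]
cells = simulate_cells_from_bulk(bulk, n_cells=4, library_size=200)
print(cells.shape)
```

Under this assumption, lowering `library_size` increases the share of zero counts, so the same sketch can be used to explore how a tool's output changes as the data become sparser.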
Conflict of interest statement
The authors declare that they have no competing interests.