RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes
- PMID: 32817073
- PMCID: PMC7462077
- DOI: 10.1101/gr.260174.119
Abstract
Despite the rapid advance in single-cell RNA sequencing (scRNA-seq) technologies within the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data, while isoform sequence analysis at the single-cell level remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%–38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%–11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%–25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is informatically accessible now.
© 2020 Nip et al.; Published by Cold Spring Harbor Laboratory Press.
Similar articles
- ClusTrast: a short read de novo transcript isoform assembler guided by clustered contigs. BMC Bioinformatics. 2024 Feb 1;25(1):54. doi: 10.1186/s12859-024-05663-3. PMID: 38302873. Free PMC article.
- TransBorrow: genome-guided transcriptome assembly by borrowing assemblies from different assemblers. Genome Res. 2020 Aug;30(8):1181-1190. doi: 10.1101/gr.257766.119. Epub 2020 Aug 17. PMID: 32817072. Free PMC article.
- Accurate inference of isoforms from multiple sample RNA-Seq data. BMC Genomics. 2015;16 Suppl 2(Suppl 2):S15. doi: 10.1186/1471-2164-16-S2-S15. Epub 2015 Jan 21. PMID: 25708199. Free PMC article.
- Identifying cell types to interpret scRNA-seq data: how, why and more possibilities. Brief Funct Genomics. 2020 Jul 29;19(4):286-291. doi: 10.1093/bfgp/elaa003. PMID: 32232401. Review.
- Mapping RNA-seq reads to transcriptomes efficiently based on learning to hash method. Comput Biol Med. 2020 Jan;116:103539. doi: 10.1016/j.compbiomed.2019.103539. Epub 2019 Nov 13. PMID: 31765913. Review.
Cited by
- Construction of an immune predictive model and identification of TRIP6 as a prognostic marker and therapeutic target of CRC by integration of single-cell and bulk RNA-seq data. Cancer Immunol Immunother. 2024 Mar 2;73(4):69. doi: 10.1007/s00262-024-03658-w. PMID: 38430268. Free PMC article.
- Chromosome-level genome assembly of the silver pomfret Pampus argenteus. Sci Data. 2024 Feb 23;11(1):234. doi: 10.1038/s41597-024-03070-0. PMID: 38395996. Free PMC article.
- ClusTrast: a short read de novo transcript isoform assembler guided by clustered contigs. BMC Bioinformatics. 2024 Feb 1;25(1):54. doi: 10.1186/s12859-024-05663-3. PMID: 38302873. Free PMC article.
- cloudrnaSPAdes: isoform assembly using bulk barcoded RNA sequencing data. Bioinformatics. 2024 Feb 1;40(2):btad781. doi: 10.1093/bioinformatics/btad781. PMID: 38262343. Free PMC article.
- cloudrnaSPAdes: Isoform assembly using bulk barcoded RNA sequencing data. bioRxiv [Preprint]. 2023 Jul 27:2023.07.25.550587. doi: 10.1101/2023.07.25.550587. PMID: 37546844. Free PMC article. Updated. Preprint.