StringTie enables improved reconstruction of a transcriptome from RNA-seq reads
- PMID: 25690850
- PMCID: PMC4643835
- DOI: 10.1038/nbt.3122
Abstract
Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph. For example, on 90 million reads from human blood, StringTie correctly assembled 10,990 transcripts, whereas the next best assembly was of 7,187 transcripts by Cufflinks, which is a 53% increase in transcripts assembled. On a simulated data set, StringTie correctly assembled 7,559 transcripts, which is 20% more than the 6,310 assembled by Cufflinks. As well as producing a more complete transcriptome assembly, StringTie runs faster on all data sets tested to date compared with other assembly software, including Cufflinks.
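The two relative gains quoted in the abstract can be reproduced from the transcript counts alone. The snippet below is a minimal arithmetic check, not part of StringTie; the helper name `percent_increase` is a hypothetical name introduced here for illustration.

```python
def percent_increase(new: int, baseline: int) -> float:
    """Relative gain of `new` over `baseline`, expressed in percent."""
    return (new - baseline) / baseline * 100.0


# Counts of correctly assembled transcripts as quoted in the abstract
# (StringTie first, Cufflinks second).
blood = percent_increase(10_990, 7_187)      # 90 million reads, human blood
simulated = percent_increase(7_559, 6_310)   # simulated data set

print(f"human blood: {blood:.1f}% more transcripts than Cufflinks")
print(f"simulated:   {simulated:.1f}% more transcripts than Cufflinks")
```

Rounded to whole percentages, these values agree with the 53% and 20% figures stated above.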
Conflict of interest statement
The authors declare no competing financial interests.
Similar articles
- Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLoS Comput Biol. 2022 Jun 1;18(6):e1009730. doi: 10.1371/journal.pcbi.1009730. eCollection 2022 Jun. PMID: 35648784. Free PMC article.
- TransComb: genome-guided transcriptome assembly via combing junctions in splicing graphs. Genome Biol. 2016 Oct 19;17(1):213. doi: 10.1186/s13059-016-1074-1. PMID: 27760567. Free PMC article.
- STAble: a novel approach to de novo assembly of RNA-seq data and its application in a metabolic model network based metatranscriptomic workflow. BMC Bioinformatics. 2018 Jul 9;19(Suppl 7):184. doi: 10.1186/s12859-018-2174-6. PMID: 30066630. Free PMC article.
- Protocol for transcriptome assembly by the TransBorrow algorithm. Biol Methods Protoc. 2023 Nov 1;8(1):bpad028. doi: 10.1093/biomethods/bpad028. eCollection 2023. PMID: 38023349. Free PMC article. Review.
- Mapping RNA-seq reads to transcriptomes efficiently based on learning to hash method. Comput Biol Med. 2020 Jan;116:103539. doi: 10.1016/j.compbiomed.2019.103539. Epub 2019 Nov 13. PMID: 31765913. Review.
Cited by
- Effects of simulated microgravity on colorectal cancer organoids growth and drug response. Sci Rep. 2024 Oct 26;14(1):25526. doi: 10.1038/s41598-024-76737-8. PMID: 39462078. Free PMC article.
- Genome-wide identification of CAMTA genes and their expression dependence on light and calcium signaling during seedling growth and development in mung bean. BMC Genomics. 2024 Oct 23;25(1):992. doi: 10.1186/s12864-024-10893-z. PMID: 39443876. Free PMC article.
- Inhibition of phospholipases suppresses progression of psoriasis through modulation of inflammation. Exp Biol Med (Maywood). 2021 Jun;246(11):1253-1262. doi: 10.1177/1535370221993424. Epub 2021 Feb 27. PMID: 33641447. Free PMC article.
- Chromosome-scale genome assembly of Cucumis hystrix-a wild species interspecifically cross-compatible with cultivated cucumber. Hortic Res. 2021 Mar 1;8(1):40. doi: 10.1038/s41438-021-00475-5. PMID: 33642577. Free PMC article.
- Analysis of long non-coding RNA expression profiles in high-glucose treated vascular endothelial cells. BMC Endocr Disord. 2020 Jul 20;20(1):107. doi: 10.1186/s12902-020-00593-6. PMID: 32689997. Free PMC article.
