Nat Methods. 2017 Jan;14(1):68-70.
doi: 10.1038/nmeth.4078. Epub 2016 Nov 21.

TACO produces robust multisample transcriptome assemblies from RNA-seq


Yashar S Niknafs et al. Nat Methods. 2017 Jan.

Abstract

Accurate transcript structure and abundance inference from RNA sequencing (RNA-seq) data is foundational for molecular discovery. Here we present TACO, a computational method to reconstruct a consensus transcriptome from multiple RNA-seq data sets. TACO employs novel change-point detection to demarcate transcript start and end sites, leading to improved reconstruction accuracy compared with other tools in its class. The tool is available at http://tacorna.github.io and can be readily incorporated into RNA-seq analysis workflows.
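
To make the idea of change-point detection concrete, the following minimal Python sketch finds the single position in a per-base coverage profile where the mean level shifts most sharply, using a least-squares split. This is an illustration only and is not TACO's actual algorithm: the function name `best_change_point`, the squared-error criterion, and the toy coverage values are all assumptions introduced here.

```python
from typing import Sequence, Tuple


def best_change_point(coverage: Sequence[float]) -> Tuple[int, float]:
    """Return (index, gain) for the single split that most reduces squared error.

    An index i means the profile is split into coverage[:i] and coverage[i:].
    The gain is the drop in the sum of squared errors relative to no split.
    Hypothetical helper for illustration; not part of TACO.
    """
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions")

    total = float(sum(coverage))
    total_sq = float(sum(x * x for x in coverage))
    # Squared error of the unsplit profile around its overall mean.
    base_sse = total_sq - total * total / n

    best_i, best_gain = 1, float("-inf")
    left_sum = 0.0
    left_sq = 0.0
    for i in range(1, n):
        # Move position i - 1 from the right segment into the left segment.
        x = float(coverage[i - 1])
        left_sum += x
        left_sq += x * x
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Squared error of each segment around its own mean.
        sse = (left_sq - left_sum ** 2 / i) + (right_sq - right_sum ** 2 / (n - i))
        gain = base_sse - sse
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain


# Toy coverage profile (made-up values): low signal, then an abrupt rise.
coverage = [2, 3, 2, 3, 2, 40, 42, 41, 39, 40]
index, gain = best_change_point(coverage)
print(index)  # 5
```

In this toy profile the split falls at index 5, where coverage jumps from about 2 to about 40, which is the kind of boundary a transcript start site might produce. A practical method would also need a significance test and support for multiple change points.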


Figures

Figure 1
Schematic detailing the transcriptome meta-assembly workflow for TACO. Reads are initially aligned to the genome. Ab initio assembly is then performed, generating a transcriptome assembly for each input sample. These transcriptome assemblies are then merged into a meta-assembly using TACO, which uses change-point detection and a dynamic programming algorithm to generate robust transcript isoforms from the underlying network of splicing patterns.
Figure 2
Assessment of TACO performance. a. Performance metrics for TACO, Cuffmerge, and StringTie when merging different numbers of input assemblies. Recall (i.e., sensitivity), precision, and the F-measure for all three tools were assessed for splicing patterns, splice junctions, and bases. Points represent the mean statistic across the 20 runs; error bars represent the 95% confidence interval. (Data for this panel can be found in Supplementary Table 3.) b,c. Precision-recall plots (left) and bar plots of the average precision (right) showing performance for the three tools merging 55 CCLE breast cancer cell lines (b) at 50 different isoform fraction cutoffs ranging from 0.001 to 0.999, and (c) for the highest expressed transcripts in the meta-assemblies. In panel c, points represent statistics for the top N transcripts, with N ranging from 500 to 30,000. (Data for panels b and c can be found in Supplementary Tables 4 and 6, respectively.)
Figure 3
Examples of TACO performance. The 3p21 genomic locus is depicted. Assemblies of the genes SLC26A6, CELSR3, and NCKIPSD are shown for 50, 100, and 500 merged samples. The assembly produced by Cuffmerge is shown in blue, StringTie in orange, and TACO in green. The RefSeq reference annotation is shown above in red.
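
The recall, precision, and F-measure named in the Figure 2 caption can be illustrated with a short Python sketch that compares a set of predicted splice junctions against a reference set. This is a hedged example, not the paper's evaluation code: the function name `precision_recall_f`, the tuple layout for a junction, and the coordinates are hypothetical, and the F-measure is assumed to be the balanced harmonic mean of precision and recall.

```python
from typing import Hashable, Set, Tuple


def precision_recall_f(
    predicted: Set[Hashable], reference: Set[Hashable]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for two sets of features.

    Hypothetical helper for illustration. The F-measure is assumed to be
    the balanced harmonic mean of precision and recall.
    """
    true_pos = len(predicted & reference)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Made-up splice junctions as (chromosome, donor, acceptor, strand) tuples.
reference = {
    ("chr3", 100, 200, "+"),
    ("chr3", 300, 400, "+"),
    ("chr3", 500, 600, "+"),
    ("chr3", 700, 800, "+"),
}
predicted = {
    ("chr3", 100, 200, "+"),
    ("chr3", 300, 400, "+"),
    ("chr3", 900, 950, "+"),
}

p, r, f = precision_recall_f(predicted, reference)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Here two of the three predicted junctions are in the reference (precision 2/3) and two of the four reference junctions are recovered (recall 1/2). The same function applies to splicing patterns if each pattern is encoded as a hashable value, such as a tuple of its junctions.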


