TieBrush: an efficient method for aggregating and summarizing mapped reads across large datasets
- PMID: 33964128
- PMCID: PMC8545345
- DOI: 10.1093/bioinformatics/btab342
Abstract
Summary: Although the ability to programmatically summarize and visually inspect sequencing data is an integral part of genome analysis, currently available methods cannot handle large numbers of samples. In particular, visually comparing the transcriptional landscapes of two sets of thousands of RNA-seq samples is limited by available computational resources, which can be overwhelmed by the sheer size of the data. In this work, we present TieBrush, a software package designed to process very large sequencing datasets (RNA, whole-genome, exome, etc.) into a form that enables quick visual and computational inspection. TieBrush can also be used to aggregate data for downstream computational analysis, and it is compatible with most software tools that take aligned reads as input.
Availability and implementation: TieBrush is provided as a C++ package under the MIT License. Precompiled binaries, source code and example data are available on GitHub (https://github.com/alevar/tiebrush).
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
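The Summary does not spell out how aggregation works, so the following is only a minimal sketch of one plausible reading: alignments that are identical across samples are collapsed into a single record that carries a read count and a sample count. The names `Alignment`, `Collapsed` and `aggregate` are hypothetical and chosen for illustration; this is not TieBrush's implementation, command-line interface or file format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set


class Alignment(NamedTuple):
    """Hypothetical, simplified alignment record (assumption, not a real BAM record)."""
    chrom: str
    pos: int
    strand: str
    cigar: str


class Collapsed(NamedTuple):
    """One representative alignment plus how often and how widely it was seen."""
    alignment: Alignment
    read_count: int    # total reads with this exact alignment, over all samples
    sample_count: int  # number of distinct samples containing it


def aggregate(samples: Iterable[Iterable[Alignment]]) -> List[Collapsed]:
    """Collapse identical alignments across samples into counted records."""
    reads: Dict[Alignment, int] = defaultdict(int)
    seen_in: Dict[Alignment, Set[int]] = defaultdict(set)
    for sample_index, sample in enumerate(samples):
        for aln in sample:
            reads[aln] += 1
            seen_in[aln].add(sample_index)
    # Sort by (chrom, pos, strand, cigar) so the output order is deterministic.
    return [Collapsed(a, reads[a], len(seen_in[a])) for a in sorted(reads)]


# Toy input: two samples sharing one alignment; one spliced read is unique to sample 1.
sample_1 = [
    Alignment("chr1", 100, "+", "50M"),
    Alignment("chr1", 100, "+", "50M"),
    Alignment("chr1", 200, "-", "20M1000N30M"),
]
sample_2 = [Alignment("chr1", 100, "+", "50M")]

for record in aggregate([sample_1, sample_2]):
    print(record.alignment, record.read_count, record.sample_count)
```

Under this assumption the output grows with the number of distinct alignments rather than the number of reads, which is why a collapsed representation of thousands of samples could stay small enough for quick visual and computational inspection.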
