The khmer software package: enabling efficient nucleotide sequence analysis
- PMID: 26535114
- PMCID: PMC4608353
- DOI: 10.12688/f1000research.6924.1
Abstract
The khmer package is a freely available software library for working efficiently with fixed-length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible de Bruijn graph representation, de Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at https://github.com/dib-lab/khmer/.
Keywords: bioinformatics; dna sequencing analysis; k-mer; khmer; kmer; low-memory; online; streaming.
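To make two of the ideas in the abstract concrete, the following is a minimal pure-Python sketch of probabilistic k-mer counting and of digital normalization. It does not use khmer and is not khmer's implementation: the names `kmers`, `ApproxKmerCounter`, and `normalize`, the table sizes, the hash function, and the default `k` and `cutoff` values are all illustrative assumptions. The counter follows the general Count-Min Sketch pattern (several hash tables, report the minimum), and the normalization step assumes the commonly described rule of keeping a read only while its median k-mer count is below a cutoff.

```python
import hashlib
from statistics import median


def kmers(seq, k):
    """Yield every overlapping fixed-length word (k-mer) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class ApproxKmerCounter:
    """Hypothetical Count-Min Sketch style counter (not khmer's API)."""

    def __init__(self, table_sizes=(1009, 1013, 1019)):
        # Table sizes are arbitrary small primes chosen for illustration.
        self.tables = [[0] * size for size in table_sizes]

    def _slots(self, kmer):
        # One slot per table; the table index salts the hash so the
        # tables behave like independent hash functions.
        for n, table in enumerate(self.tables):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([n])
            ).digest()
            yield table, int.from_bytes(digest, "big") % len(table)

    def add(self, kmer):
        for table, slot in self._slots(kmer):
            table[slot] += 1

    def count(self, kmer):
        # Collisions can only inflate a slot, so the minimum across
        # tables may overcount but never undercounts.
        return min(table[slot] for table, slot in self._slots(kmer))


def normalize(reads, k=4, cutoff=2):
    """Keep a read only if its median k-mer count is still below cutoff."""
    counter = ApproxKmerCounter()
    kept = []
    for read in reads:
        words = list(kmers(read, k))
        if not words:
            continue  # read shorter than k: nothing to count
        if median(counter.count(w) for w in words) < cutoff:
            kept.append(read)
            for w in words:
                counter.add(w)
    return kept
```

Because the counter stores only fixed-size tables rather than the k-mers themselves, memory use is set up front and reads can be processed in a single streaming pass, at the cost of occasional overcounts. Real data sets would call for a larger `k` and much larger tables than the toy values used here.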
