F1000Res. 2015 Sep 25:4:900.
doi: 10.12688/f1000research.6924.1. eCollection 2015.

The khmer software package: enabling efficient nucleotide sequence analysis

Michael R Crusoe et al. F1000Res. 2015.

Abstract

The khmer package is a freely available software library for working efficiently with fixed-length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python and is released under the BSD license at https://github.com/dib-lab/khmer/.
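
To make the idea of probabilistic k-mer counting concrete, the following is a minimal sketch in plain Python. It assumes a Count-Min Sketch style counter, which can overestimate a count when hashes collide but never underestimates it. It does not use khmer's API: the names `kmers` and `CountMinSketch`, and the parameters `width` and `depth`, are hypothetical and chosen only for illustration.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping fixed-length word (k-mer) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class CountMinSketch:
    """Approximate counter using fixed memory (illustrative, not khmer's API).

    Counts may be too high because of hash collisions, never too low.
    """

    def __init__(self, width=1000, depth=4):
        self.width = width    # counters per table (assumed value)
        self.depth = depth    # number of independent tables (assumed value)
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item):
        # One salted hash per table selects one counter in that table.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._slots(item):
            self.tables[row][col] += 1

    def count(self, item):
        # The smallest counter is the one least inflated by collisions.
        return min(self.tables[row][col] for row, col in self._slots(item))


sketch = CountMinSketch()
for word in kmers("ACGTACGTAC", 4):
    sketch.add(word)

# "ACGT" occurs twice in the read, so the estimate is at least 2.
print(sketch.count("ACGT"))
```

Memory use here is fixed by `width` and `depth` rather than by the number of distinct k-mers, which is the trade-off that makes this kind of structure attractive for large sequencing data sets: some counting accuracy is exchanged for a bounded footprint.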

Keywords: bioinformatics; dna sequencing analysis; k-mer; khmer; kmer; low-memory; online; streaming.

Conflict of interest statement

Competing interests: No competing interests were disclosed.
