SeqLib: a C++ API for rapid BAM manipulation, sequence alignment and sequence assembly

Bioinformatics. 2017 Mar 1;33(5):751-753. doi: 10.1093/bioinformatics/btw741.

Abstract

We present SeqLib, a C++ API and command line tool that provides a rapid and user-friendly interface to BAM/SAM/CRAM files, global sequence alignment operations and sequence assembly. Four C libraries perform core operations in SeqLib: HTSlib for BAM access, BWA-MEM and BLAT for sequence alignment and Fermi for error correction and sequence assembly. Benchmarking indicates that SeqLib has lower CPU and memory requirements than leading C++ sequence analysis APIs. We demonstrate an example of how minimal SeqLib code can extract, error-correct and assemble reads from a CRAM file and then align with BWA-MEM. SeqLib also provides additional capabilities, including chromosome-aware interval queries and read plotting. Command line tools are available for performing integrated error correction, micro-assemblies and alignment.
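To illustrate what a chromosome-aware interval query involves, the following is a minimal, self-contained sketch in standard C++. It does not use SeqLib and does not reproduce its API: the names `Interval` and `IntervalIndex` are hypothetical, and the half-open, 0-based coordinate convention is an assumption made for this example. Intervals are bucketed by chromosome name, so a query only ever scans intervals on its own chromosome.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical type for illustration only; NOT a SeqLib class.
// Coordinates are assumed 0-based and half-open: [start, end).
struct Interval {
    std::string chrom;
    int32_t start;
    int32_t end;
};

// Hypothetical index that keeps one start-sorted vector per chromosome.
class IntervalIndex {
public:
    // Insert an interval, keeping its chromosome's vector sorted by start.
    void add(const Interval& iv) {
        assert(iv.start < iv.end);
        std::vector<Interval>& v = by_chrom_[iv.chrom];
        auto pos = std::upper_bound(
            v.begin(), v.end(), iv,
            [](const Interval& a, const Interval& b) { return a.start < b.start; });
        v.insert(pos, iv);
    }

    // Return every stored interval that overlaps q on the same chromosome.
    std::vector<Interval> query(const Interval& q) const {
        std::vector<Interval> hits;
        auto it = by_chrom_.find(q.chrom);
        if (it == by_chrom_.end()) return hits;  // unknown chromosome: no hits
        for (const Interval& iv : it->second) {
            if (iv.start >= q.end) break;        // sorted by start: nothing later overlaps
            if (iv.end > q.start) hits.push_back(iv);
        }
        return hits;
    }

private:
    std::map<std::string, std::vector<Interval> > by_chrom_;
};
```

A query for `{"chr1", 180, 190}` against an index holding `chr1:100-200`, `chr1:150-300` and `chr2:100-200` would return only the two `chr1` intervals. The linear scan within a chromosome keeps the sketch short; a production implementation would more likely use an interval tree.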

Availability and implementation: SeqLib is available on Linux and OSX for the C++98 standard and later at github.com/walaj/SeqLib. SeqLib is released under the Apache 2.0 license. Additional capabilities for BLAT alignment are available under the BLAT license.

Contact: jwala@broadinstitute.org; rameen@broadinstitute.org.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes
  • High-Throughput Nucleotide Sequencing / methods
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software*