Gossamer--a resource-efficient de novo assembler

Bioinformatics. 2012 Jul 15;28(14):1937-8. doi: 10.1093/bioinformatics/bts297. Epub 2012 May 18.

Abstract

Motivation: The de novo assembly of short-read high-throughput sequencing data poses significant computational challenges. The volume of data is huge, the reads are tiny compared to the underlying sequence, and there are significant numbers of sequencing errors. There are numerous software packages that allow users to assemble short reads, but most are limited to relatively small genomes (e.g. bacteria), require large computing infrastructure, or employ greedy algorithms and thus often do not yield high-quality results.

Results: We have developed Gossamer, an implementation of the de Bruijn approach to assembly that requires close to the theoretical minimum of memory, but still allows efficient processing. Our results show that it is space efficient and produces high-quality assemblies.
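
As an illustration of the de Bruijn approach mentioned above, the following minimal Python sketch builds a de Bruijn graph from a handful of reads and computes one possible memory lower bound. It is not Gossamer's implementation. The function names (`kmers`, `build_de_bruijn`, `min_bits_for_kmer_set`) are hypothetical, and reading "theoretical minimum" as the information-theoretic bound log2 C(4^k, n) bits for storing n distinct k-mers is an assumption that the abstract does not confirm.

```python
import math
from collections import defaultdict


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Illustrative only: a plain dict of sets is far from memory-minimal.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            # Each k-mer links its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def min_bits_for_kmer_set(n, k):
    """Bits needed to name one set of n distinct k-mers out of 4**k.

    Assumed reading of "theoretical minimum": log2 of the number of
    possible n-element subsets of the 4**k k-mers. Exact integer
    arithmetic keeps this practical only for small k and n.
    """
    return math.log2(math.comb(4 ** k, n))


if __name__ == "__main__":
    g = build_de_bruijn(["ACGT", "CGTA"], k=3)
    for node in sorted(g):
        print(node, "->", sorted(g[node]))
    print(min_bits_for_kmer_set(n=3, k=3), "bits for 3 distinct 3-mers")
```

The dictionary-of-sets graph is chosen for readability. A space-efficient assembler would need a much more compact encoding of the same edge set, which is the gap the bound above is meant to illustrate.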

Availability: Gossamer is available for non-commercial use from http://www.genomics.csse.unimelb.edu.au/product-gossamer.php.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • High-Throughput Nucleotide Sequencing / methods*
  • Sequence Analysis, DNA / methods*
  • Software*