Assembly algorithms for deep sequencing data: basics and pitfalls

Methods Mol Biol. 2013;1038:81-91. doi: 10.1007/978-1-62703-514-9_5.


Our ability to sequence the genomic data at our disposal is limited. In each experiment we can reliably sequence only a small fraction of even the smallest genome. We are then faced with the challenge of assembly: combining the short patches we have into a correct reconstruction of as large a fragment of the original sample as possible. The problem has been thoroughly researched, and many commercial and academic tools exist to carry it out. However, due to basic features of the problem, the results of even our best efforts will sometimes be disappointing for the researcher. In this chapter we try to explain why the assembly problem is so hard, what directions may alleviate it in the near future, and what can realistically be expected from a current assembly experiment.

MeSH terms

  • Algorithms*
  • Animals
  • Computer Graphics
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, DNA / methods