An improved sequence assembly program

Genomics. 1996 Apr 1;33(1):21-31. doi: 10.1006/geno.1996.0155.


We describe a number of improvements to the CAP sequence assembly program. These improvements include the development of methods for solving the problem caused by simple repetitive sequences, for automatically editing fragment alignments and consensus sequences, and for identifying chimeric fragments. The improved program (CAP2) assembled each of seven data sets, six of which contain repetitive sequences of very strong similarity, into a single sequence. As an example, CAP2 assembled a set of 1467 fragments into a single sequence of 73,328 bp that has only eight differences from the original sequence. The effects of fragment length, coverage, and error rate on the performance of CAP2 were evaluated using artificial data sets.
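The abstract does not say how the artificial data sets were built. As a rough illustration only, the sketch below samples fixed-length fragments from a known sequence at a chosen coverage and substitution error rate. The function name `simulate_fragments`, its parameters, the substitution-only error model, and the single-strand sampling are all assumptions made for this example, not the authors' procedure. The fragment count follows the usual definition coverage = (number of fragments × fragment length) / sequence length.

```python
import random


def simulate_fragments(genome, frag_len, coverage, error_rate, seed=0):
    """Sample artificial shotgun fragments from `genome` (hypothetical helper).

    Assumptions for illustration: fixed fragment length, forward strand only,
    and substitution errors only (no insertions or deletions).
    Returns a list of (start_position, fragment_sequence) tuples.
    """
    if not 0 < frag_len <= len(genome):
        raise ValueError("frag_len must be between 1 and len(genome)")
    rng = random.Random(seed)
    alphabet = "ACGT"
    # coverage = n_frags * frag_len / len(genome), solved for n_frags
    n_frags = max(1, round(coverage * len(genome) / frag_len))
    fragments = []
    for _ in range(n_frags):
        start = rng.randrange(0, len(genome) - frag_len + 1)
        bases = []
        for base in genome[start:start + frag_len]:
            # With probability error_rate, replace the base with a different one.
            if rng.random() < error_rate:
                base = rng.choice([b for b in alphabet if b != base])
            bases.append(base)
        fragments.append((start, "".join(bases)))
    return fragments


# Example: a random 5,000 bp sequence, 500 bp fragments, 6x coverage, 2% errors.
rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(5000))
fragments = simulate_fragments(genome, frag_len=500, coverage=6, error_rate=0.02)
```

Varying `frag_len`, `coverage`, and `error_rate` one at a time while keeping the source sequence fixed gives the kind of controlled input on which an assembler's output can be compared against the known original.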

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence*
  • Cosmids
  • Mice
  • Molecular Sequence Data*
  • Mycobacterium leprae
  • Repetitive Sequences, Nucleic Acid
  • Sequence Alignment
  • Software