Sequence alignment with tandem duplication

J Comput Biol. Fall 1997;4(3):351-67. doi: 10.1089/cmb.1997.4.351.

Abstract

Algorithm development for comparing and aligning biological sequences has, until recently, been based on the SI model of mutational events, which assumes that sequences are modified through the operations of substitution, insertion, or deletion (the latter two collectively termed indels). While this model has worked fairly well, it has long been apparent that other mutational events occur. In this paper, we introduce a new model, the DSI model, which includes another common mutational event, tandem duplication. Tandem duplication produces tandem repeats, which are common in DNA, making up perhaps 10% of the human genome. They are responsible for some human diseases and may serve a multitude of functions in DNA regulation and evolution. Using the DSI model, we develop new exact and heuristic algorithms for comparing and aligning DNA sequences that contain tandem repeats.
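To make the distinction between the two models concrete, the following Python sketch shows why a single tandem duplication looks expensive under an SI-style cost. It is an illustration only and not the exact or heuristic algorithms developed in the paper. The function names `si_edit_distance` and `tandem_duplicate`, the unit costs, and the example sequence are all assumptions chosen for this sketch.

```python
def si_edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance using only substitution, insertion, deletion.

    Assumption: every operation costs 1. The abstract does not state the
    paper's scoring scheme, so this is the standard textbook recurrence.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (0) or substitute (1)
            ))
        prev = curr
    return prev[-1]


def tandem_duplicate(seq: str, start: int, length: int) -> str:
    """Copy seq[start:start+length] and place the copy directly after it."""
    if length <= 0 or start < 0 or start + length > len(seq):
        raise ValueError("duplicated unit must lie inside the sequence")
    unit = seq[start:start + length]
    return seq[:start + length] + unit + seq[start + length:]


# Hypothetical example sequence, not taken from the paper's data.
ancestor = "ACGTT"
descendant = tandem_duplicate(ancestor, start=1, length=3)  # duplicates "CGT"

print(descendant)                              # ACGTCGTT
print(si_edit_distance(ancestor, descendant))  # 3
```

Under these assumed unit costs, the SI model accounts for the one duplication event as three independent insertions. A model that treats tandem duplication as an operation in its own right, as the DSI model does, can instead explain the same difference with a single event.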

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computer Simulation*
  • DNA / chemistry
  • Humans
  • Models, Genetic*
  • Molecular Sequence Data
  • Mutagenesis*
  • Repetitive Sequences, Nucleic Acid*
  • Sequence Alignment / methods*

Substances

  • DNA

Associated data

  • GENBANK/U58989
  • GENBANK/U80896