SIMD parallelization of the WORDUP algorithm for detecting statistically significant patterns in DNA sequences

Comput Appl Biosci. 1993 Dec;9(6):701-7. doi: 10.1093/bioinformatics/9.6.701.

Abstract

The development of new techniques for sequencing nucleic acids has produced a great amount of sequence data and has led to the discovery of new relationships. In this paper, we study a method for parallelizing the WORDUP algorithm, which detects statistically significant patterns in DNA sequences. WORDUP implements an efficient method for identifying statistically significant oligomers in a non-homologous group of sequences. It is based on a modified version of the Boyer-Moore algorithm, one of the fastest string-matching algorithms available in the literature. The aim of the parallel version of WORDUP presented here is to reduce computation time and to allow the analysis of larger sets of longer nucleotide sequences, which is usually impractical with sequential algorithms.
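
As a rough illustration of the kind of string matching the abstract refers to, the following sketch uses the Horspool simplification of Boyer-Moore to count how many sequences in a group contain a given oligomer. It is not the modified Boyer-Moore variant used in WORDUP, which the abstract does not describe, and it includes neither the statistical significance test nor the SIMD parallelization. The function names `horspool_find` and `sequences_containing` and the example sequences are hypothetical.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Horspool simplification of Boyer-Moore (an assumption for illustration,
    not WORDUP's own modification).
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character table: for each character in the pattern (except the
    # last position), the distance from its rightmost occurrence to the end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text character aligned with the pattern's
        # last position; characters absent from the table allow a full shift.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def sequences_containing(sequences, oligomer):
    """Count the sequences in which the oligomer occurs at least once."""
    return sum(1 for seq in sequences if horspool_find(seq, oligomer) != -1)


# Made-up example sequences, for illustration only.
example = ["ACGTTATAAG", "GGGCCC", "TTTATAAC"]
count = sequences_containing(example, "TATAA")
```

Each sequence is searched independently of the others, so this per-sequence search is one natural unit of work to distribute across processors. Whether the paper partitions the work this way is not stated in the abstract.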

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computer Systems
  • DNA / genetics*
  • Pattern Recognition, Automated
  • Sequence Alignment / statistics & numerical data
  • Sequence Analysis, DNA / statistics & numerical data*
  • Sequence Homology, Nucleic Acid
  • Software Design
  • Software*

Substances

  • DNA