An exact algorithm to identify motifs in orthologous sequences from multiple species

Proc Int Conf Intell Syst Mol Biol. 2000:8:37-45.

Abstract

The identification of sequence motifs is a fundamental method for suggesting good candidates for biologically functional regions such as promoters, splice sites, and binding sites. We investigate the following approach to identifying motifs: given a collection of orthologous sequences from multiple species related by a known phylogenetic tree, search for motifs that are well conserved (according to a parsimony measure) across the species. We present an exact algorithm for solving this problem. We then discuss experimental results on finding promoters of the rbcS gene for a family of 10 plants, on finding promoters of the adh gene for 12 Drosophila species, and on finding promoters of several chloroplast-encoded genes.
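To make the problem statement concrete, the following is a minimal sketch of one way to score conservation by parsimony on a known tree. It is not taken from the paper and may differ from the authors' algorithm. It assumes a fixed motif length k, a DNA alphabet, Hamming distance as the substitution cost, and a tree given as nested tuples of leaf names. The function name best_parsimony_score and the input layout are hypothetical choices made for illustration. Each leaf may use any k-mer occurring in its sequence, internal nodes may be labelled with any k-mer, and a Sankoff-style dynamic program finds the labelling with the fewest substitutions along the tree edges.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: plain DNA, no ambiguity codes


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_parsimony_score(tree, sequences, k):
    """Return (root_kmer, score) minimising total substitutions on the tree.

    tree      -- nested tuples; a leaf is a species name (str)   [hypothetical layout]
    sequences -- dict mapping species name to its sequence
    k         -- motif length (exhaustive over 4**k k-mers, so keep k small)
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]

    def cost_table(node):
        # Maps each k-mer to the minimum number of substitutions needed in
        # the subtree below `node` if `node` itself is labelled with it.
        if isinstance(node, str):
            seq = sequences[node]
            present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
            # A leaf may only be labelled with a k-mer found in its sequence.
            return {m: (0 if m in present else float("inf")) for m in kmers}
        child_tables = [cost_table(child) for child in node]
        table = {}
        for m in kmers:
            # For each child, pick the child label that is cheapest overall.
            table[m] = sum(
                min(t[c] + hamming(m, c) for c in kmers) for t in child_tables
            )
        return table

    root = cost_table(tree)
    best = min(root, key=root.get)
    return best, root[best]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    tree = (("a", "b"), "c")
    seqs = {"a": "TTACGTT", "b": "GGACGGG", "c": "CCACTCC"}
    print(best_parsimony_score(tree, seqs, 3))
```

In the toy input, species a and b both contain ACG while c contains ACT, so the best labelling needs a single substitution. The exhaustive loop over all 4^k labels is only practical for short motifs; it is meant to illustrate the objective, not to be efficient.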

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Base Sequence
  • Genome*
  • Molecular Sequence Data
  • Promoter Regions, Genetic
  • Sequence Analysis / methods*
  • Species Specificity