Detecting seeded motifs in DNA sequences

Nucleic Acids Res. 2005 Sep 1;33(15):e135. doi: 10.1093/nar/gni131.

Abstract

Detecting functionally relevant DNA motifs in real biological sequences is difficult because of a number of biological, statistical and computational issues, and because little is known about the structure of the patterns being sought. Many algorithms are implemented as fully automated processes, which often depend on input parameters that the user must guess at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, which are composed of regions with different degrees of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from the input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. Combining automated pattern discovery algorithms with different display tools for evaluating and selecting results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different real yeast and human datasets showed the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST.
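
The first two steps named in the abstract (extracting overrepresented exact patterns, then clustering them into candidate core regions) can be illustrated with a minimal sketch. This is not the MOST implementation, whose details the abstract does not give. The function names, the independent-base background model, the observed/expected ratio threshold, and the greedy Hamming-distance clustering are all assumptions chosen for illustration only.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(sequences, k):
    """Count every exact k-mer (A/C/G/T only) across all input sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts


def overrepresented_kmers(sequences, k, min_ratio=2.0, min_count=2):
    """Return (kmer, observed, observed/expected) for enriched k-mers.

    Assumption: the expected count comes from a simple background model in
    which bases are independent and drawn at their observed frequencies.
    """
    counts = count_kmers(sequences, k)
    base_counts = Counter(b for s in sequences for b in s.upper() if b in BASES)
    total_bases = sum(base_counts.values())
    if total_bases == 0:
        return []
    freqs = {b: base_counts[b] / total_bases for b in BASES}
    total_windows = sum(counts.values())

    results = []
    for kmer, observed in counts.items():
        prob = 1.0
        for b in kmer:
            prob *= freqs[b]
        expected = prob * total_windows
        ratio = observed / expected
        if observed >= min_count and ratio >= min_ratio:
            results.append((kmer, observed, ratio))
    # Most enriched first; ties broken alphabetically for reproducibility.
    results.sort(key=lambda r: (-r[2], r[0]))
    return results


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_kmers(kmers, max_mismatches=1):
    """Greedy clustering: join the first cluster whose seed is close enough."""
    clusters = []
    for kmer in kmers:
        for cluster in clusters:
            if hamming(kmer, cluster[0]) <= max_mismatches:
                cluster.append(kmer)
                break
        else:
            clusters.append([kmer])
    return clusters
```

In this sketch, each cluster stands in for a candidate core region. The later steps the abstract mentions (extending the cores, scoring them, and selecting results with the help of display tools) are not modelled here.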

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • DNA, Fungal / chemistry
  • Humans
  • Internet
  • Promoter Regions, Genetic
  • Regulatory Sequences, Nucleic Acid*
  • Sequence Analysis, DNA / methods*
  • Software*
  • Yeasts / genetics

Substances

  • DNA, Fungal