Spectral Repeat Finder (SRF): identification of repetitive sequences using Fourier transformation

Bioinformatics. 2004 Jun 12;20(9):1405-12. doi: 10.1093/bioinformatics/bth103. Epub 2004 Feb 19.

Abstract

Motivation: Repetitive DNA sequences, besides having a variety of regulatory functions, are one of the principal causes of genomic instability. Understanding their origin and evolution is of fundamental importance for genome studies. Identifying repeats and their units helps in deducing intra-genomic dynamics, an important feature of comparative genomics. A major difficulty in identifying repeats is that the repeat units can be exact or imperfect, in tandem or dispersed, and of unspecified length.

Results: The Spectral Repeat Finder program circumvents these problems by using a discrete Fourier transformation to identify significant periodicities present in a sequence. The specific regions of the sequence that contribute to a given periodicity are located through a sliding window analysis, and an exact search method is then used to find the repetitive units. Efficient and complete detection of repeats is provided, together with interactive and detailed visualization of the spectral analysis of the input sequence. We demonstrate the utility of our method with various examples that contain previously unannotated repeats. A Web server has been developed for convenient access to the automated program.
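The three stages named in the Results (spectral detection of a periodicity, sliding window localization, and recovery of the repeat unit) can be illustrated with a minimal Python sketch. This is not the SRF implementation. It assumes a common encoding in which each base gets a binary indicator sequence and the powers of the four sequences are summed. All function names are hypothetical, and the unit recovery step is a simple in-phase chunk count standing in for the exact search method, which the abstract does not specify.

```python
import cmath
from collections import Counter

BASES = "ACGT"

def power_at_period(seq, period):
    """Summed spectral power of the four base-indicator sequences
    at frequency 1/period, normalized by sequence length."""
    total = 0.0
    for base in BASES:
        acc = 0j
        for n, ch in enumerate(seq):
            if ch == base:
                acc += cmath.exp(-2j * cmath.pi * n / period)
        total += abs(acc) ** 2
    return total / len(seq)

def dominant_period(seq):
    """Period N/k of the strongest DFT component for k = 1 .. N//2."""
    n = len(seq)
    best_k = max(range(1, n // 2 + 1),
                 key=lambda k: power_at_period(seq, n / k))
    return n / best_k

def sliding_window_power(seq, period, window, step=1):
    """(start, power) pairs: power at the given period in each window."""
    return [(start, power_at_period(seq[start:start + window], period))
            for start in range(0, len(seq) - window + 1, step)]

def repeat_unit(region, period):
    """Most common period-length chunk read in phase from the region start.
    Assumed stand-in for an exact search; handles perfect repeats only."""
    chunks = [region[i:i + period]
              for i in range(0, len(region) - period + 1, period)]
    return Counter(chunks).most_common(1)[0][0]

if __name__ == "__main__":
    # Toy input: a period-3 tandem repeat flanked by homopolymer runs.
    seq = "A" * 30 + "ACG" * 10 + "T" * 30
    period = 3
    start, power = max(sliding_window_power(seq, period, window=30),
                       key=lambda pair: pair[1])
    print(start, repeat_unit(seq[start:start + 30], period))
```

The direct summation costs O(N) per frequency, which is acceptable for a toy input; a practical tool would more likely use a fast Fourier transform and a significance threshold on the peaks, neither of which is attempted here.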

Availability: The Web server is available at http://www.imtech.res.in/raghava/srf and http://www2.imtech.res.in/raghava/srf

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • DNA / analysis
  • DNA / chemistry*
  • DNA / genetics*
  • Fourier Analysis
  • Pattern Recognition, Automated / methods
  • Periodicity
  • Repetitive Sequences, Nucleic Acid / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid

Substances

  • DNA