Analysis of transposable element sequences using CENSOR and RepeatMasker

Methods Mol Biol. 2009;537:323-36. doi: 10.1007/978-1-59745-251-9_16.


Eukaryotic genomes are rich in repetitive DNA, transposable elements (TEs) in particular, and a number of computational methods are accordingly available for identifying TEs in genomic sequences. We present here a survey of two of the most readily available and widely used bioinformatics applications for the detection, characterization, and analysis of TE sequences in eukaryotic genomes: CENSOR and RepeatMasker. For each program, we provide information on availability, input, output, and the algorithmic methods used, and we describe specific examples of its use. CENSOR and RepeatMasker both rely on homology-based methods for the detection of TE sequences. Several other classes of methods are available for the analysis of repetitive DNA sequences, including de novo methods that compare genomic sequences against themselves, class-specific methods that use structural characteristics of specific classes of elements to aid in their identification, and pipeline methods that combine aspects of some or all of the aforementioned methods. We briefly consider the strengths and weaknesses of these different classes of methods, with an emphasis on their complementary utility for the analysis of repetitive DNA in eukaryotes.
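
To make the idea of homology-based detection concrete, the following is a minimal toy sketch in Python, not the algorithm used by CENSOR or RepeatMasker. It assumes that homology can be approximated by exact shared k-mers between a query sequence and a small library of known repeat sequences. The real programs instead use scored sequence alignment against curated repeat libraries. The function names (`kmer_index`, `mask_repeats`), the library entry `toy_TE`, and the choice of `k=8` are all hypothetical and chosen for illustration only. The sketch also ignores reverse-complement matches and diverged copies, both of which a practical tool must handle.

```python
def kmer_index(library, k):
    """Map every k-mer in the repeat library to the names of the repeats containing it."""
    index = {}
    for name, seq in library.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index.setdefault(seq[i:i + k], set()).add(name)
    return index


def mask_repeats(genome, library, k=8):
    """Replace with 'N' every genome position covered by a k-mer that also occurs in the library."""
    index = kmer_index(library, k)
    genome = genome.upper()
    covered = [False] * len(genome)
    for i in range(len(genome) - k + 1):
        if genome[i:i + k] in index:
            # Mark all k positions of this matching window as repeat-derived.
            for j in range(i, i + k):
                covered[j] = True
    return "".join("N" if is_repeat else base
                   for base, is_repeat in zip(genome, covered))


# Hypothetical data: one made-up repeat embedded between two unique flanks.
library = {"toy_TE": "ACGTTGCAAGGCTTAC"}
genome = "TTTTTTTT" + "ACGTTGCAAGGCTTAC" + "GGGGGGGG"
masked = mask_repeats(genome, library)
print(masked)
```

In this toy case the embedded copy is replaced by `N` characters while the flanking sequence is left untouched. This mirrors, in a highly simplified way, the masking of library-matching regions that homology-based tools perform.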

MeSH terms

  • Base Sequence
  • Computational Biology
  • DNA Transposable Elements / genetics*
  • Molecular Sequence Data
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software*
  • User-Computer Interface

