The TIGR Plant Repeat Databases: a collective resource for the identification of repetitive sequences in plants

Shu Ouyang; C Robin Buell

doi:10.1093/nar/gkh099

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D360-3. doi: 10.1093/nar/gkh099.

Authors

Shu Ouyang¹, C Robin Buell

Affiliation

¹ The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA.

Abstract

In a number of higher plants, a substantial portion of the genome is composed of repetitive sequences that can hinder genome annotation and sequencing efforts. To better understand the nature of repetitive sequences in plants and provide a resource for identifying such sequences, we constructed databases of repetitive sequences for 12 plant genera: Arabidopsis, Brassica, Glycine, Hordeum, Lotus, Lycopersicon, Medicago, Oryza, Solanum, Sorghum, Triticum and Zea (www.tigr.org/tdb/e2k1/plant.repeats/index.shtml). The repetitive sequences within each database have been coded into super-classes, classes and sub-classes based on sequence and structure similarity. These databases are available for sequence similarity searches as well as downloadable files either as entire databases or subsets of each database. To further the utility for comparative studies and to provide a resource for searching for repetitive sequences in other genera within these families, repetitive sequences have been combined into four databases to represent the Brassicaceae, Fabaceae, Gramineae and Solanaceae families. Collectively, these databases provide a resource for the identification, classification and analysis of repetitive sequences in plants.
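
The three-level coding described in the abstract (super-class, class, sub-class) can be illustrated with a minimal sketch. It assumes a hypothetical FASTA header layout of `id|super_class|class|sub_class`; the actual header and code format of the TIGR download files is not given here, and the function names, example sequences and category labels below are illustrative only.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_classification(records):
    """Nest records as super_class -> class -> sub_class -> [(id, seq)].

    Assumes the hypothetical header layout 'id|super_class|class|sub_class'.
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for header, seq in records:
        seq_id, super_class, cls, sub_class = header.split("|")
        tree[super_class][cls][sub_class].append((seq_id, seq))
    return tree


# Made-up records, used only to show the shape of the hierarchy.
EXAMPLE = """>rep1|transposable_element|retrotransposon|ty1_copia
ACGTACGT
>rep2|transposable_element|retrotransposon|ty3_gypsy
GGCCTTAA
>rep3|transposable_element|transposon|mite
TTTTAAAA
"""

tree = group_by_classification(parse_fasta(EXAMPLE))
retro = tree["transposable_element"]["retrotransposon"]
print(sorted(retro))
```

Grouping in this way makes it straightforward to extract a subset of a database, for example all sequences in one class, which mirrors the subset downloads mentioned in the abstract.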

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Computational Biology
DNA, Plant / genetics
Databases, Nucleic Acid*
Information Storage and Retrieval
Internet
Plants / classification
Plants / genetics*
Repetitive Sequences, Nucleic Acid*

Substances

DNA, Plant