Nucleic Acids Res. 2005 Jan 1;33(Database issue):D86-90.
doi: 10.1093/nar/gki097.

DoOP: Databases of Orthologous Promoters, Collections of Clusters of Orthologous Upstream Sequences From Chordates and Plants

Free PMC article

Endre Barta et al. Nucleic Acids Res.


DoOP is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. DNA segments of up to 3000 bp upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7,548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species, and it is also suitable for viewing the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP by either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
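As a rough illustration of what "up to 3000 bp upstream from the first exon" means in practice, the following minimal sketch extracts such a segment from a plain sequence string. It is not the DoOP pipeline itself: the function names, the 0-based half-open coordinate convention and the strand handling are assumptions made here for illustration only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(seq: str, exon_start: int, exon_end: int,
                    strand: str, max_len: int = 3000) -> str:
    """Return up to max_len bases immediately upstream of a first exon.

    Assumed convention (hypothetical, not from the paper): the exon occupies
    seq[exon_start:exon_end] on the forward strand, 0-based and half-open.
    The result is always oriented 5'->3' relative to the gene, and it is
    shorter than max_len when the sequence ends before max_len is reached.
    """
    if strand == "+":
        begin = max(0, exon_start - max_len)
        return seq[begin:exon_start]
    if strand == "-":
        end = min(len(seq), exon_end + max_len)
        # On the minus strand, "upstream" lies to the right of the exon
        # on the forward strand, so the slice is reverse-complemented.
        return reverse_complement(seq[exon_end:end])
    raise ValueError("strand must be '+' or '-'")
```

For a gene on the minus strand the upstream segment is taken from the forward-strand bases that follow the exon and then reverse-complemented, so that every segment in a cluster reads in the same direction relative to its gene.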


Figure 1
The data flow of the generation of the chordate DoOP database. The same method is used for the plant DoOP database, except that the source BLAST database comes from all Viridiplantae sequences and the query sequences are generated based on the NCBI A.thaliana annotation.
Figure 2
Different types of genes according to the positions of the first mRNA and coding (cds) exons. Types 5 and 6 fall into different subcategories based on the number of the first coding exon. If it is the second, as in this figure, we call it type 52 or 62; otherwise we refer to them generally as 5n or 6n. The positions of the query sequences relative to the first exons are marked with green boxes, while the 500, 1000 and 3000 bp upstream regions that have been put into the database are marked with red boxes.
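The subtype naming and the three stored region sizes described in this caption can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the assumption that the 500 and 1000 bp regions are the exon-proximal ends of the 3000 bp region is an interpretation of the figure, not something the caption states.

```python
WINDOW_SIZES = (500, 1000, 3000)  # region sizes named in the caption, in bp


def subtype_label(gene_type: int, first_coding_exon: int) -> str:
    """Build a label such as '52' or '62' for type 5 or 6 genes.

    first_coding_exon is the ordinal number (n) of the first coding exon,
    giving the general '5n' / '6n' pattern used in the caption.
    """
    if gene_type not in (5, 6):
        raise ValueError("subtypes are only described for types 5 and 6")
    return f"{gene_type}{first_coding_exon}"


def upstream_windows(upstream: str) -> dict:
    """Cut the exon-proximal 500, 1000 and 3000 bp from an upstream segment.

    Assumes (hypothetically) that `upstream` is oriented 5'->3' and ends
    immediately before the first exon. Windows longer than the available
    sequence simply return the whole segment.
    """
    return {size: upstream[-size:] for size in WINDOW_SIZES}
```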
Figure 3
Examples of the DoOP dataviews. In the picture of the cluster the boxes numbered from m1 to m13 show the conserved motifs, the black box shows a predicted repetitive element, while the blue box shows the 5′-UTR.



