Nucleic Acids Res. 2013 Jan;41(Database issue):D157-64.
doi: 10.1093/nar/gks1233. Epub 2012 Nov 27.

EPD and EPDnew, High-Quality Promoter Resources in the Next-Generation Sequencing Era

Free PMC article

René Dreos et al. Nucleic Acids Res.

Abstract

The Eukaryotic Promoter Database (EPD), available online at http://epd.vital-it.ch, is a collection of experimentally defined eukaryotic POL II promoters which has been maintained for more than 25 years. A promoter is represented by a single position in the genome, typically the major transcription start site (TSS). EPD primarily serves biologists interested in analysing the motif content, chromatin structure or DNA methylation status of co-regulated promoter subsets. Initially, promoter evidence came from TSS mapping experiments targeted at single genes and published in journal articles. Today, the TSS positions provided by EPD are inferred from next-generation sequencing data distributed in electronic form. Traditionally, EPD has been a high-quality database with low coverage. The focus of recent efforts has been to reach complete gene coverage for important model organisms. To this end, we introduced a new section called EPDnew, which is automatically assembled from multiple, carefully selected input datasets. As another novelty, we started to use chromatin signatures in addition to mRNA 5′ tags to locate promoters of weakly expressed genes. Regarding user interfaces, we introduced a new promoter viewer which enables users to explore promoter-defining experimental evidence in a UCSC genome browser window.
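To make the notion of a promoter as "a single position, typically the major TSS" concrete, here is a minimal sketch of one way such a position could be picked from mapped 5′ tag positions. It is an illustration only, not the EPDnew pipeline: the function name `major_tss`, the tie-breaking rule and the sample coordinates are all assumptions, and strand handling and chromatin evidence are left out.

```python
from collections import Counter


def major_tss(tag_positions):
    """Return the genomic position supported by the most 5' tags.

    Hypothetical helper for illustration: ties are broken by taking the
    smallest coordinate, which is an arbitrary choice made here.
    """
    counts = Counter(tag_positions)
    if not counts:
        raise ValueError("at least one tag position is required")
    # Sort key: highest tag count first, then lowest coordinate.
    return min(counts, key=lambda pos: (-counts[pos], pos))


# Made-up 5' tag start coordinates for a single gene.
tags = [1003, 1005, 1005, 1005, 1006, 1006, 1010]
print(major_tss(tags))
```

A production pipeline would work on both strands and would probably cluster nearby tags before choosing a representative position.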

Figures

Figure 1.
Positional distribution of TSSs and selected chromatin marks in an ENSEMBL-derived human promoter set (A), in the subset of ENSEMBL promoters that was selected for inclusion in EPD (B), and in promoters from EPDnew where TSS positions were re-assigned with the aid of new CAGE and oligo-capping data (C). TSS tags corresponding to the 5′-ends of oligo-capped cDNAs were taken from DBTSS version 7 (14). ChIP-Seq data for H3K4me3 and Pol-II were taken from (7).
Figure 2.
Work and data flow in EPDnew. (A) Physical and logical connections between source data, automatic procedures and human intervention in the development and production of EPDnew. (B) Details of the promoter assembly pipeline used in the production of the current Drosophila promoter collection of EPDnew.
Figure 3.
EPD viewer screenshot for the human MRP-L35 promoter. The image was automatically generated and downloaded from the UCSC genome browser (15). The EPD-supplied tracks show experimental TSS sites from DBTSS7 (14) and FANTOM4 (26), chromatin marks from (7) and DNA methylome data from (8). The CpG island, genome conservation and repetitive element tracks are from the UCSC genome browser database (19). The EPD viewer page contains a link which enables users to automatically upload the EPD-supplied tracks to the UCSC genome browser for further customization and dynamic exploration of the promoter regions.
Figure 4.
DNA motif-based evaluation of promoter collections. Shown are the positional distributions of the (A) TATA- and (B) CCAAT-boxes in an ENSEMBL-derived human promoter collection, in the subset of ENSEMBL promoters that was selected for inclusion in EPD, and in EPDnew. The higher peaks and the lower background frequency of the TATA-box motif in panel A indicate that EPDnew is of higher quality than ENSEMBL.
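The kind of positional distribution shown in Figure 4 can be sketched as a count of motif occurrences at each offset relative to the TSS. The snippet below is a simplified illustration under stated assumptions: it matches the exact string `TATAAA` rather than a weight matrix, the function name `motif_position_counts` is hypothetical, and the sequence windows are made up. If TSS positions are accurate, the counts would be expected to pile up at one upstream offset; imprecise TSS assignments would spread them out.

```python
from collections import Counter


def motif_position_counts(windows, tss_index, motif="TATAAA"):
    """Count motif start positions relative to the TSS.

    windows   -- promoter sequences of equal length, all on the sense strand
    tss_index -- 0-based index of the TSS within each window
    motif     -- exact consensus string (an assumption made for simplicity)
    """
    counts = Counter()
    for seq in windows:
        seq = seq.upper()
        start = seq.find(motif)
        while start != -1:
            counts[start - tss_index] += 1
            start = seq.find(motif, start + 1)
    return counts


# Made-up 40-bp windows with the TSS at index 32.
windows = [
    "GG" + "TATAAA" + "G" * 32,  # motif starts 30 bp upstream of the TSS
    "G" * 40,                    # no motif
    "TATAAA" + "G" * 34,         # motif starts 32 bp upstream
    "GG" + "TATAAA" + "G" * 32,  # motif starts 30 bp upstream
]
for offset, n in sorted(motif_position_counts(windows, tss_index=32).items()):
    print(offset, n)
```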

References

    1. Cavin Perier R, Junier T, Bucher P. The Eukaryotic Promoter Database EPD. Nucleic Acids Res. 1998;26:353–357.
    2. Suzuki Y, Yamashita R, Nakai K, Sugano S. DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs. Nucleic Acids Res. 2002;30:328–331.
    3. Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T, et al. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc. Natl Acad. Sci. USA. 2003;100:15776–15781.
    4. Tateno Y, Saitou N, Okubo K, Sugawara H, Gojobori T. DDBJ in collaboration with mass-sequencing teams on annotation. Nucleic Acids Res. 2005;33:D25–D28.
    5. Schmid CD, Praz V, Delorenzi M, Perier R, Bucher P. The Eukaryotic Promoter Database EPD: the impact of in silico primer extension. Nucleic Acids Res. 2004;32:D82–D85.
