BMC Bioinformatics. 2006 Mar 7:7:114.
doi: 10.1186/1471-2105-7-114.

Genome-wide analysis of core promoter elements from conserved human and mouse orthologous pairs

Victor X Jin et al. BMC Bioinformatics. 2006.

Abstract

Background: The canonical core promoter elements consist of the TATA box, initiator (Inr), downstream core promoter element (DPE), TFIIB recognition element (BRE) and the newly discovered motif 10 element (MTE). The motifs for these core promoter elements are highly degenerate, which tends to produce a high false discovery rate when attempting to detect them in promoter sequences.

Results: In this study, we have performed the first analysis of these core promoter elements in orthologous mouse and human promoters with experimentally supported transcription start sites. We have identified these various elements using a combination of positional weight matrices (PWMs) and the degree of conservation of orthologous mouse and human sequences, a procedure that significantly reduces the false positive rate of motif discovery. Our analysis of 9,010 orthologous mouse-human promoter pairs revealed two combinations of three-way synergistic effects, TATA-Inr-MTE and BRE-Inr-MTE. The former has previously been putatively identified in human, but the latter represents a novel synergistic relationship.

Conclusion: Our results demonstrate that DNA sequence conservation can greatly improve the identification of functional core promoter elements in the human genome. The data also underscore the importance of the synergistic occurrence of two or more core promoter elements. Furthermore, the sequence data and results presented here can help build better computational models for predicting transcription start sites in promoter regions, which remains one of the most challenging problems.
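The combination of PWM scoring and cross-species conservation described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the toy count matrix, the score threshold, the identity cutoff, and all function names are hypothetical, and the human and mouse sequences are assumed to be pre-aligned, ungapped, uppercase ACGT strings of equal length.

```python
import math

BASES = "ACGT"

# Hypothetical toy counts for a 4-bp TATA-like motif (illustrative only,
# not taken from the paper). Each column sums to 20 observations.
TOY_COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 17, "C": 1, "G": 1, "T": 1},
]


def counts_to_pwm(counts, background=0.25):
    """Convert per-position base counts to log2-odds scores against a uniform background."""
    pwm = []
    for col in counts:
        total = sum(col.values())
        pwm.append({b: math.log2((col[b] / total) / background) for b in BASES})
    return pwm


def score(pwm, site):
    """Sum the log-odds score of each base in a site of the same width as the PWM."""
    return sum(col[base] for col, base in zip(pwm, site))


def scan(seq, pwm, threshold):
    """Return start positions in seq whose PWM score meets the threshold."""
    width = len(pwm)
    return [
        i
        for i in range(len(seq) - width + 1)
        if score(pwm, seq[i:i + width]) >= threshold
    ]


def conserved_hits(human, mouse, pwm, threshold, min_identity=0.75):
    """Keep human hits whose aligned mouse site is similar and also scores above threshold.

    Assumption: this particular conservation rule (identity cutoff plus a
    mouse PWM score) is one plausible filter, not the paper's exact criterion.
    """
    if len(human) != len(mouse):
        raise ValueError("sequences must be aligned to equal length")
    width = len(pwm)
    hits = []
    for i in scan(human, pwm, threshold):
        h_site, m_site = human[i:i + width], mouse[i:i + width]
        identity = sum(a == b for a, b in zip(h_site, m_site)) / width
        if identity >= min_identity and score(pwm, m_site) >= threshold:
            hits.append(i)
    return hits
```

With these toy inputs, a single-genome scan can report a site that the conservation filter then rejects because the aligned mouse sequence has diverged, which is the mechanism by which conservation lowers the false positive rate.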

Figures

Figure 1
The schematic diagram of the core promoter elements.
Figure 2
Average sequence similarity between orthologous human and mouse promoter regions is very high at the transcription start site (vertical dotted line), and drops sharply both up- and downstream of this point. Points indicate the mean percent identity in sliding 20-base windows along our dataset of 9,010 orthologous mouse-human promoter pairs. 95% confidence interval bars are plotted at every 10 bases.
Figure 3
The number of core promoter elements in the real promoter sequences (dark bars) significantly exceeds the numbers found in the randomized sequences (light bars) in all cases. When we add the criterion that the element must be conserved in the mouse genome, we find that the gap between the number of elements found in the real data versus the random data widens, indicating an increase in the signal-to-noise ratio.
Figure 4
The number of each motif discovered in its expected position relative to the true TSS (position zero) represents a local maximum when compared to the sequence immediately upstream. The dotted lines show the results for a single-genome scan, while the solid lines show the results when only those motifs that are conserved in the orthologous mouse promoter are accepted.
Figure 5
Pairs of motifs are much more likely to occur at the TSS than within sequences up- or downstream of the TSS. The dotted and solid lines are as described in the legend for Figure 4.
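The sliding-window percent identity plotted in Figure 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each promoter pair is assumed to be pre-aligned and ungapped with equal length, and the 95% confidence intervals shown in the figure are not computed here.

```python
def window_identity(seq_a, seq_b, window=20):
    """Percent identity in each sliding window along two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = [a == b for a, b in zip(seq_a, seq_b)]
    return [
        100.0 * sum(matches[i:i + window]) / window
        for i in range(len(matches) - window + 1)
    ]


def mean_profile(pairs, window=20):
    """Average the per-window identity across many orthologous pairs of equal length."""
    profiles = [window_identity(a, b, window) for a, b in pairs]
    return [sum(col) / len(col) for col in zip(*profiles)]
```

For example, `window_identity("ACGTACGT", "ACGTTTTT", window=4)` returns `[100.0, 75.0, 50.0, 25.0, 25.0]`: identity falls as the window slides into the diverged region.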
