BMC Genomics. 2009 Aug 18;10:381.
doi: 10.1186/1471-2164-10-381.

Relationship Between Estrogen Receptor Alpha Location and Gene Induction Reveals the Importance of Downstream Sites and Cofactors

Free PMC article

Fabio Parisi et al. BMC Genomics. 2009.

Abstract

Background: Understanding cancer-related modifications to transcriptional programs requires detailed knowledge about the activation of signal-transduction pathways and gene expression programs. To investigate the mechanisms of target gene regulation by human estrogen receptor alpha (hERalpha), we combine extensive location and expression datasets with genomic sequence analysis. In particular, we study the influence of patterns of DNA occupancy by hERalpha on expression phenotypes.

Results: We find that strong ChIP-chip sites co-localize with strong hERalpha consensus sites and detect nucleotide bias near hERalpha sites. The localization of ChIP-chip sites relative to annotated genes shows that weak sites are enriched near transcription start sites, while stronger sites show no positional bias. Assessing the relationship between binding configurations and expression phenotypes, we find binding sites downstream of the transcription start site (TSS) to be as good as, or better than, upstream sites as predictors of hERalpha-mediated expression. The study of FOX and SP1 cofactor sites near hERalpha ChIP sites shows that induced genes frequently have FOX or SP1 sites. Finally, we integrate these multiple datasets to define a high-confidence set of primary hERalpha target genes.

Conclusion: Our results support the model of long-range interactions of hERalpha with the cofactor SP1 bound at the promoter of hERalpha target genes. FOX motifs co-occur with hERalpha motifs along responsive genes. Importantly, we show that the spatial arrangement of sites near the start sites and within the full transcript is important in determining response to estrogen signaling.

Figures

Figure 1
EREs and ChIP sites. A. Number of hERα sites for 1 kbp sequences centered around the ChIP sites identified by SLM. The number of sites is computed from a Hidden Markov Model (cf. Methods) using posterior decoding. Results are stratified as a function of the strength of the binding site (t-score). The density profile (red) shows bimodality for high t-scores. The median (dots) is calculated in bins of one unit in t-score. A smoothed estimator (in grey) has been added as a visual aid. The cut-offs used for defining highest (t > 16) and lower stringency sites (t > 10) are indicated with vertical lines. The monotonic trend can be approximated by a sigmoid (tanh) function with half-height at t~10 and saturating at t~16 (>90%). B. Left: Average occupancy profile at each genomic position computed using posterior decoding for the hERα consensus (e.g. 0.01 means that 1% of sequences have an ERE at this precise position). The profile is centered on the mode of the ChIP-chip site (red dashed line). Right: Fraction of EREs within a given radius of the mode of the ChIP signal. The ChIP sites identified with SLM have a width of about 1 kbp (width of the peak), while the binding sites for the 80% of sites with a consensus (one position with posterior probability >0.5) are found within 200 bp of the mode in the t-profile.
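As a reading aid for the sigmoid description in panel A, the following is a minimal sketch of a tanh curve with half-height at t = 10 that reaches 90% of its plateau at t = 16. The normalization to a 0-1 plateau fraction, the width parameter, and all names (`T_HALF`, `T_SAT`, `SAT_LEVEL`, `sigmoid_fraction`) are illustrative assumptions; the caption gives only the two landmark values, not the fitted form.

```python
import math

T_HALF = 10.0     # half-height position quoted in the caption
T_SAT = 16.0      # t-score at which the caption describes saturation
SAT_LEVEL = 0.9   # assumption: "saturating (>90%)" read as 90% of the plateau

# Assumed width: chosen so that the curve equals SAT_LEVEL exactly at T_SAT.
WIDTH = (T_SAT - T_HALF) / math.atanh(2.0 * SAT_LEVEL - 1.0)


def sigmoid_fraction(t):
    """Fraction of the plateau reached at t-score t (hypothetical form)."""
    return 0.5 * (1.0 + math.tanh((t - T_HALF) / WIDTH))


if __name__ == "__main__":
    for t in (4, 10, 16, 22):
        print(t, round(sigmoid_fraction(t), 3))
```

Under these assumptions the curve is symmetric around t = 10, so it sits at 10% of the plateau at t = 4 and at 90% at t = 16.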
Figure 2
Characteristics of weak and strong ChIP sites. A. Average nucleotide composition profile for ChIP sites with ERE consensus sites (posterior probability > 0.5). The sequences are centered on the ERE. Both sets, the low (left panel) and high (right panel) stringency sites, show a maximum GC enrichment within 200 bp of the ERE. Notice that GC content has not reached the genome-wide baseline at ± 2.5 kbp, and the drop-off is faster for the stronger sites (right). Each gray dot represents the mean frequency at one position; the smoothed mean (black) ± 2 SD (gray) are shown as lines. B. Localization of hERα binding sites relative to annotated transcription start sites (TSSs) and poly-adenylation sites (PASs). The percentage of occurrence is calculated relative to the number of sites in the full window (± 50 kbp of TSS or of PAS, bin size 500 bp). Coordinates are taken as positive in the transcript direction, but the results show an absence of directionality in the profiles. Left panel: Distribution of distances from TSSs for sites with 10<t<16 mapped in the 5' regions. The noticeable peak around the TSS covers 12% of the total number of sites in the region. We thus find a tight colocalization with the TSS (defined as 0, green profile) for a subset of sites. In contrast, no colocalization is evident for the PAS (red profile). Right panel: Distribution of distances from TSSs for sites with t>16 mapped in the 5' regions. In this case, sites are uniformly distributed in the 50 kbp around the TSS (green profile) and around the PAS (red profile).
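The positional profile in panel B can be illustrated with a short sketch: distances are signed so that positive values point in the transcript direction, sites outside ± 50 kbp are dropped, and each 500 bp bin is reported as a percentage of the sites kept. The function names, the strand encoding ("+" or "-"), and the half-open window convention are assumptions made for illustration, not details taken from the paper.

```python
WINDOW = 50_000   # ± 50 kbp around the anchor (TSS or PAS), as in the caption
BIN_SIZE = 500    # bin size in bp, as in the caption


def signed_distance(site_pos, anchor_pos, strand):
    """Distance from the anchor, positive in the transcript direction."""
    d = site_pos - anchor_pos
    return d if strand == "+" else -d


def percentage_profile(distances, window=WINDOW, bin_size=BIN_SIZE):
    """Percentage of in-window sites falling in each bin."""
    n_bins = 2 * window // bin_size
    counts = [0] * n_bins
    kept = 0
    for d in distances:
        # Assumed convention: the window is half-open, [-window, window).
        if -window <= d < window:
            counts[(d + window) // bin_size] += 1
            kept += 1
    if kept == 0:
        return [0.0] * n_bins
    return [100.0 * c / kept for c in counts]


if __name__ == "__main__":
    profile = percentage_profile([0, 10, -10, 49_999, 60_000])
    print(len(profile), profile[99], profile[100], profile[199])
```

Normalizing by the number of sites inside the window, rather than by all sites, mirrors the caption's statement that percentages are relative to the full window.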
Figure 3
ROC analysis for comparing the ability of upstream or downstream ChIP sites to predict induced genes. In each experiment, the induced genes (positives) are taken as the 1% highest ranking transcripts. The remaining 99% are taken as the negatives. Note that for discrete data such as the number of sites in a specific genomic window, ROC analysis consists of a set of operating points (cf. Methods). The number of sites downstream of the TSS (light curves) shows the best performance among all the definitions tested (high stringency and low stringency ChIP-chip, ChIP-pet). Shown are the cancer expression compendium [12] and the study on primary estrogen receptor targets [20]. Further expression sets are shown in [see Additional file 3].
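The remark about discrete predictors can be made concrete with a small sketch. Here the top 1% of transcripts by induction score are labeled positive, and each threshold of the form "at least k sites" yields one (false positive rate, true positive rate) point. The function names and the tie handling are illustrative assumptions; the paper's exact procedure is described in its Methods and is not reproduced here.

```python
def top_fraction_labels(scores, fraction=0.01):
    """Label the highest-scoring `fraction` of transcripts as positives."""
    n_pos = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    labels = [False] * len(scores)
    for i in order[:n_pos]:
        labels[i] = True
    return labels


def roc_operating_points(site_counts, labels):
    """One (FPR, TPR) point per threshold 'count >= k', strictest first."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Include a threshold above the maximum so the curve starts at (0, 0).
    thresholds = sorted(set(site_counts) | {max(site_counts) + 1}, reverse=True)
    points = []
    for k in thresholds:
        tp = sum(1 for c, y in zip(site_counts, labels) if y and c >= k)
        fp = sum(1 for c, y in zip(site_counts, labels) if not y and c >= k)
        points.append((fp / n_neg, tp / n_pos))
    return points


if __name__ == "__main__":
    scores = list(range(200))          # toy induction scores
    counts = [0] * 200                 # toy site counts per transcript
    counts[199], counts[198], counts[0] = 2, 1, 1
    print(roc_operating_points(counts, top_fraction_labels(scores)))
```

Because a site count takes only a handful of integer values, the result is a few isolated points rather than a smooth curve, which is why curves for different window definitions can share the same envelope.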
Figure 4
ROC analysis to compare the ability of ChIP sites in variably sized windows to predict induced genes. Positives and negatives are taken as in Fig. 3. The number and definition of operating points are as in Fig. 3. The discrete curves lie on the same envelope, but the number of sites along the transcript (black curve) shows the best performance for all the site definitions used (high stringency and low stringency ChIP-chip, ChIP-pet). The expression datasets are the cancer expression compendium and the study on primary estrogen receptor targets. The same analyses using different definitions of sites (high stringency or ChIP-pet) are given in [see Additional file 4]. Further expression sets are analyzed in [see Additional file 5].
Figure 5
Response to hERα increases as a function of the number of ChIP sites along transcripts. Ranks of the induction scores are shown as boxplots as a function of the number of ChIP sites. In B-D, ChIP sites are further filtered according to the presence or absence of consensus elements for hERα (B) or FOX (C, D). A motif is assigned to a binding site if the occupancy, computed using posterior decoding, is greater than 0.5 (cf. Methods). The ranks of the induction scores of the cancer expression compendium and of the study on primary estrogen receptor targets have been pooled to avoid small sample size effects. Significance of the comparisons is assessed using the Wilcoxon rank sum test; comparisons are made between equally colored distributions. A-B. Effect of ERE motifs. For ChIP-pet sites the presence of an ERE improves the correlation between ranks and number of sites (orange boxes). No statistically significant improvement is detected for the ChIP-chip sites (blue boxes). C-D. Effect of FOX motifs. In addition to the presence of an ERE, the presence of a FOX motif significantly improves the association for the ChIP-pet sites (red box). Comparison with panel B (black boxes) indicates that many ChIP-pet sites with EREs also have FOX sites. Despite a shift in the distribution to higher ranks, no statistically significant improvement is detected for the ChIP-chip sites (purple boxes) with a FOX site.
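For readers unfamiliar with the test named above, here is a minimal sketch of a Wilcoxon rank sum comparison between two groups of induction ranks. It uses midranks for ties and a normal approximation without tie or continuity correction, which is a simplifying assumption; the function name `rank_sum_test` is hypothetical, and the paper does not state which variant or software was used.

```python
import math


def rank_sum_test(x, y):
    """Return (z, two-sided p) for the Wilcoxon rank sum test of x versus y.

    Normal approximation, midranks for ties, no tie or continuity
    correction (a simplification chosen for this sketch).
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0   # mean of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    no_motif = [120, 340, 560, 610, 700]      # toy ranks, not real data
    with_motif = [650, 720, 810, 900, 950]    # toy ranks, not real data
    print(rank_sum_test(no_motif, with_motif))
```

A negative z here means the first group tends to have lower ranks than the second. For small groups an exact test would be more appropriate than the normal approximation.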
Figure 6
SP1 acts as a cofactor for promoter proximal hERα sites. Ranks of the induction scores are shown as boxplots as a function of the number of promoter proximal (± 5 kbp around the TSS) ChIP sites with and without SP1 sites. The presence of SP1 motifs underlying promoter proximal ChIP sites increases the induction rank. Ranks are computed as in Fig. 5 and SP1 motifs are assigned to a ChIP site if the occupancy is greater than 0.5 (cf. Methods). Significance is assessed by the Wilcoxon rank sum test.

References

    1. Jordan VC. The past, present, and future of selective estrogen receptor modulation. Ann N Y Acad Sci. 2001;949:72–79.
    2. Ali S, Coombes RC. Endocrine-responsive breast cancer and strategies for combating resistance. Nat Rev Cancer. 2002;2(2):101–112. doi: 10.1038/nrc721.
    3. Simpson ER. Sources of estrogen and their importance. J Steroid Biochem Mol Biol. 2003;86(3–5):225–230. doi: 10.1016/S0960-0760(03)00360-1.
    4. Walker P, Germond JE, Brown-Luedi M, Givel F, Wahli W. Sequence homologies in the region preceding the transcription initiation site of the liver estrogen-responsive vitellogenin and apo-VLDLII genes. Nucleic Acids Res. 1984;12(22):8611–8626. doi: 10.1093/nar/12.22.8611.
    5. Sanchez R, Nguyen D, Rocha W, White JH, Mader S. Diversity in the mechanisms of gene regulation by estrogen receptors. Bioessays. 2002;24(3):244–254. doi: 10.1002/bies.10066.
