Genome Biol., 19 (1), 190

Recurrent Mutations at Estrogen Receptor Binding Sites Alter Chromatin Topology and Distal Gene Expression in Breast Cancer

Jiekun Yang et al. Genome Biol.

Abstract

Background: The mutational processes underlying non-coding cancer mutations and their biological significance in tumor evolution are poorly understood. To get better insights into the biological mechanisms of mutational processes in breast cancer, we integrate whole-genome level somatic mutations from breast cancer patients with chromatin states and transcription factor binding events.

Results: We discover that a large fraction of non-coding somatic mutations in estrogen receptor (ER)-positive breast cancers are confined to ER binding sites. Notably, the highly mutated estrogen receptor binding sites are associated with more frequent chromatin loop contacts, and the associated distal genes are expressed at higher levels. To elucidate the functional significance of these non-coding mutations, we focus on two of the recurrently mutated estrogen receptor binding sites. Our bioinformatic and biochemical analyses suggest loss of DNA-protein interactions due to the recurrent mutations. Through CRISPR interference, we find that the recurrently mutated regulatory element at the LRRC3C-GSDMA locus impacts the expression of multiple distal genes. Using a CRISPR base editor, we show that the recurrent C→T conversion at the ZNF143 locus results in decreased TF binding, increased chromatin loop formation, and increased expression of multiple distal genes. This single point mutation mediates reduced response to estradiol-induced cell proliferation but increased resistance to tamoxifen-induced growth inhibition.

Conclusions: Our data suggest that ER binding is associated with localized accumulation of somatic mutations, some of which affect chromatin architecture, distal gene expression, and cellular phenotypes in ER-positive breast cancer.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
ER binding is associated with increased somatic mutation rates in breast cancer. Heatmaps show DNase I sequencing read intensity as a measure of DNase hypersensitivity in MCF-7 cells (ENCODE) and ER ChIP-seq read intensity in 21 ER+ breast cancer samples profiled by Ross-Innes et al. [23]. Observed somatic mutation rates (red line) for 560 ER+ breast cancer patients (ICGC BRCA-EU) [4] were calculated for sites with different ER binding and DNase hypersensitivity intensity. Expected mutation rates (black line) were calculated based on tri-nucleotide compositions of corresponding genomic sequences using a previously established method [15]. Fold changes (blue bar) compare the observed mutation rates within 200 bp of ER binding or DHS peaks with the rates in flanking regions (> 200 bp and ≤ 1 kb); corresponding P values (orange bar) were obtained using a chi-square test followed by Benjamini-Hochberg adjustment. a The observed and expected mutation rates were calculated for three sets of DHS sites with comparable intensity: the sites that overlapped with ER ChIP-seq peaks (DHS w/ ERBS), the sites that overlapped with other ENCODE-identified TF binding sites but not ER binding sites (DHS w/o ERBS), and finally DHS with no TF binding sites (DHS w/o TF BS). b The observed and expected somatic mutation rates for four quartiles of ER binding sites with increasing ER binding intensity are shown. c The observed and expected somatic mutation rates are shown for ERBS that are shared by more than 3 patients, shared by 2 patients, or patient-specific. Fold changes and P values are shown for each set of ERBS as described above
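The fold-change and P-value calculation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each base pair is an independent site that is either mutated or not, uses an uncorrected Pearson chi-square test on a 2 × 2 table (peak vs. flank, mutated vs. not mutated), and all function names and counts are hypothetical.

```python
import math


def fold_change(mut_peak, bp_peak, mut_flank, bp_flank):
    """Ratio of the mutation rate near the peak to the rate in the flank."""
    return (mut_peak / bp_peak) / (mut_flank / bp_flank)


def chi_square_2x2(mut_peak, bp_peak, mut_flank, bp_flank):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Assumption: every base pair counts as one trial that is mutated or not.
    """
    a, b = mut_peak, bp_peak - mut_peak      # peak: mutated, not mutated
    c, d = mut_flank, bp_flank - mut_flank   # flank: mutated, not mutated
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one would compute `chi_square_2x2` once per set of sites (for example each quartile in panel b), collect the raw P values, and pass the list to `benjamini_hochberg`.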
Fig. 2
Frequently mutated ERBS are associated with more chromatin loops and higher gene expression. Boxplots presented in this figure illustrate the lower quartile (Q1) and upper quartile (Q3) as the box, median as the line inside the box, and 1.5 × interquartile range (IQR = Q3 - Q1) as the whiskers. a Boxplot depicts corrected long-range chromatin contact frequency from Pol2 ChIA-PET data in MCF-7 cells, for ERBS with different numbers of somatic mutations in BRCA-EU. The contact frequency was corrected using a negative binomial linear regression model to remove the effect of ER binding intensity (Additional file 1: Figure S5). Gray dashed line indicates the average corrected contact frequency for all ERBS. b Boxplot represents expression levels of genes that are topologically associated (within the same TAD and associated with ERBS via ChIA-PET loop) or linearly associated (50 kb distance) with ERBS. ERBS were grouped according to the number of mutations within 200 bp of their summits (same as in panel a). c Mean number of somatic mutations is plotted for ERBS that are associated with good outcome, poor outcome/metastasis and shared by at least 75% of breast cancer patients (core ERBS) [23]. The average mutation number was calculated by randomly sampling 100 ERBS from each group 50 times. P values are calculated using two-sided Student’s t test. d Bar plot shows the number of BRCA-EU patients carrying mutations at the ERBS that contained the largest number of somatic mutations within 200 bp of the summit across all the patients (except for FOXA1, where somatic mutations ~ 100 bp beyond the 200-bp limit were included due to their recurrence). Asterisks indicate recurrent mutations (present in at least two BRCA-EU patients). Gene symbols for the ERBS within coding regions are shown inside the bars. The two ERBS that are characterized in this study are in bold font
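The resampling procedure in panel c (100 ERBS drawn from each group, repeated 50 times, then compared with a Student's t test) could look roughly like the sketch below. It is an illustration under stated assumptions: sampling is done without replacement, the t statistic uses the classic pooled-variance form, and the function names and seed are hypothetical. Only the statistic and degrees of freedom are returned; converting them to a two-sided P value needs a t-distribution CDF, which the Python standard library does not provide.

```python
import math
import random
import statistics


def resampled_means(mutation_counts, sample_size=100, n_rounds=50, seed=0):
    """Mean mutation count of `sample_size` randomly drawn ERBS, per round.

    Assumption: each round samples without replacement from one ERBS group.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(mutation_counts, sample_size))
        for _ in range(n_rounds)
    ]


def student_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )
    return t, nx + ny - 2
```

Given two lists of per-ERBS mutation counts, `student_t(resampled_means(group_a), resampled_means(group_b))` compares the 50 resampled means of one group against those of the other.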
Fig. 3
A recurrent intergenic somatic mutation disrupts TF binding and decreases expression of distal genes. a Genomic region of two recurrent somatic mutations and their neighboring genes is shown. Inset shows the number of BRCA-EU patients with mutations in the intergenic locus between the LRRC3C and GSDMA genes. Nucleotide changes for the two recurrent mutations and the relative position of the ER peak (gray shadow) are shown. Relevant tracks (ENCODE) and positions of the sgRNAs used in panel d are also displayed. b Motif scores were calculated with and without each mutation using the PWMEnrich package [57], which performs DNA motif enrichment analysis against databases such as MotifDb. Motif score ratios were displayed as blue and red bars representing higher motif scores with and without the mutation, respectively. Downward black arrows indicate the mutation position within each motif. c EMSA results demonstrate protein binding affinity for WT and mutant oligonucleotides (oligos) with either double or single mutations. The three lanes for each case are biotin-labeled oligos only, biotin-labeled oligos plus nuclear extract, and biotin-labeled oligos plus nuclear extract and competitor probes from left to right. Competitor probes are unlabeled oligos to examine DNA-protein binding specificity. Non-specific interactions are labeled as “n.s.”. d Neighboring gene expression levels were assessed by qRT-PCR in MCF-7 cells with CRISPR-dCas9-based interference of control and the mutation sites. All the P values were calculated with two-sided Student’s t test. ***P < 0.001. Error bars represent standard deviations from six biological replicates
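The idea behind the motif score ratio in panel b can be illustrated with a toy position weight matrix (PWM). This sketch does not reproduce PWMEnrich's scoring; the matrix, the uniform background, the sequences, and the function names are all hypothetical and chosen only to show how one base change alters a motif score.

```python
# Hypothetical 3-position PWM: per-position base probabilities.
TOY_PWM = [
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]


def pwm_score(seq, pwm, background=0.25):
    """Odds score of `seq` under the PWM relative to a uniform background."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and PWM must have the same length")
    score = 1.0
    for position, base in enumerate(seq):
        score *= pwm[position][base] / background
    return score


def motif_score_ratio(wild_type, mutant, pwm):
    """Mutant score divided by wild-type score.

    Values below 1 suggest the mutation weakens the motif; values above 1
    suggest it strengthens or creates the motif.
    """
    return pwm_score(mutant, pwm) / pwm_score(wild_type, pwm)
```

For the toy matrix, changing the first base of `CGA` from C to T lowers that position's probability from 0.7 to 0.1, so `motif_score_ratio("CGA", "TGA", TOY_PWM)` is below 1, the pattern the caption describes as a higher score without the mutation.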
Fig. 4
A recurrent non-coding somatic mutation at the ZNF143 locus affects TF binding, 3D chromatin architecture and expression of multiple distal genes. a Genomic region of the recurrent mutation at the ZNF143 promoter and the neighboring genes is shown. Inset shows the number of BRCA-EU patients with mutations around the ZNF143 promoter. The sequence flanking the C to T mutation and the relative position of the ER peak (gray shadow) to the mutations are shown. Relevant ENCODE sequencing tracks are also displayed. b Motif score ratios were calculated between genomic sequences with and without the mutation. Blue bars indicate higher motif scores with the mutation, thus motif created; red bars represent higher motif scores without the mutation, thus motif disrupted. Downward black arrows indicate the mutation position within each motif. c EMSA results demonstrate protein binding affinity for WT and mutant (with the C>T mutation; Mut) oligonucleotides. The three lanes for each case are biotin-labeled oligos only, biotin-labeled oligos plus nuclear extract, and biotin-labeled oligos plus nuclear extract and competitor probes from left to right. Competitor probes are unlabeled oligos to examine DNA-protein binding specificity. Non-specific interactions are labeled as “n.s.”. d Schematic representation of the CRISPR base editor approach to introduce the C to T mutation into MCF-7 cells. qPCR was utilized to screen genomes of more than 400 single cell colonies to detect the specific mutation. e Sanger sequencing results show the genomic sequences at and around the mutation site in WT and two mutant (Mut) MCF-7 clones. f ChIP-qPCR analysis shows ZBTB7A enrichment at the mutation site in MCF-7 WT cells and a mutant clone. Error bars represent standard errors of four independent data points (biological replicates). The P value was calculated using one-sided Student’s t test. 
g qRT-PCR results show relative mRNA levels of genes that are topologically or spatially associated with the mutant site in WT and mutant MCF-7 clones. Error bars represent standard deviations from 11 biological replicates. h Bar graphs show contact frequency between the mutated site and the four other proximal sites in WT and mutant MCF-7 cells as measured by the Chromosome Conformation Capture (3C) assay. Interacting sites from the MCF-7 Pol2 ChIA-PET data are colored in magenta. Hypothetical interaction with the control site is indicated with a gray dashed line. The blue boxes at the end of the interaction curves indicate the primer positions used in the 3C assay. Error bars represent standard deviations (2 biological replicates). i Crystal violet colony formation assay measures the relative size and viability of colonies for WT and mutant MCF-7 cells in response to control, estradiol (E2) and tamoxifen (Tam.) treatment. Images and corresponding quantifications are shown. Error bars represent standard deviations from 12 biological replicates. All the P values were calculated with two-sided Student’s t test unless indicated otherwise. ***P < 0.001, **P < 0.01, *P < 0.05


References

    1. Yates LR, Campbell PJ. Evolution of the cancer genome. Nat Rev Genet. 2012;13:795–806. doi: 10.1038/nrg3317.
    2. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113–1120. doi: 10.1038/ng.2764.
    3. International Cancer Genome Consortium, Hudson TJ, Anderson W, Artez A, Barker AD, Bell C, Bernabe RR, Bhan MK, Calvo F, Eerola I, et al. International network of cancer genome projects. Nature. 2010;464:993–998. doi: 10.1038/nature08987.
    4. Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, Martincorena I, Alexandrov LB, Martin S, Wedge DC, et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature. 2016;534:47–54. doi: 10.1038/nature17676.
    5. Weinhold N, Jacobsen A, Schultz N, Sander C, Lee W. Genome-wide analysis of noncoding regulatory mutations in cancer. Nat Genet. 2014;46:1160–1165. doi: 10.1038/ng.3101.
