Cell Rep. 2016 Aug 16;16(7):2032-46.
doi: 10.1016/j.celrep.2016.07.028. Epub 2016 Aug 4.

Direct Transcriptional Consequences of Somatic Mutation in Breast Cancer

Adam Shlien et al. Cell Rep.

Abstract

Disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription, coordinated secondary pathway alterations, and increased transcriptional noise. To catalog the rules governing how somatic mutation exerts direct transcriptional effects, we developed an exhaustive pipeline for analyzing RNA sequencing data, which we integrated with whole genomes from 23 breast cancers. Using X-inactivation analyses, we found that cancer cells are more transcriptionally active than intermixed stromal cells. This is especially true in estrogen receptor (ER)-negative tumors. Overall, 59% of substitutions were expressed. Nonsense mutations showed lower expression levels than expected, with patterns characteristic of nonsense-mediated decay. 14% of 4,234 rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusions, and premature polyadenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data reveals the rules by which transcriptional machinery interprets somatic mutation.
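
The genome-to-transcriptome comparison behind statements such as "59% of substitutions were expressed" can be illustrated with a minimal Python sketch. It computes a variant allele fraction (VAF) from read counts in DNA and in RNA, then takes their difference. The record layout, function names, and read counts are hypothetical, and treating the difference as the VAFdiff quantity named in the Figure 1 caption is an assumption (the caption only says "relative to the genome"). The five-read coverage cutoff mirrors the threshold mentioned in the Figure 2 caption. This is not the authors' pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MutationCounts:
    """Read counts at one somatic substitution (hypothetical record layout)."""
    dna_ref: int  # genome reads supporting the reference allele
    dna_alt: int  # genome reads supporting the mutant allele
    rna_ref: int  # transcriptome reads supporting the reference allele
    rna_alt: int  # transcriptome reads supporting the mutant allele


def vaf(ref: int, alt: int) -> float:
    """Variant allele fraction: mutant reads divided by all reads at the site."""
    total = ref + alt
    if total == 0:
        raise ValueError("no coverage at this site")
    return alt / total


def vaf_diff(m: MutationCounts, min_rna_cov: int = 5) -> Optional[float]:
    """Transcriptome VAF minus genome VAF, or None if RNA coverage is too low.

    Assumption: VAFdiff is a simple difference of the two fractions.
    """
    if m.rna_ref + m.rna_alt < min_rna_cov:
        return None
    return vaf(m.rna_ref, m.rna_alt) - vaf(m.dna_ref, m.dna_alt)


# Made-up counts for illustration only.
balanced = MutationCounts(dna_ref=30, dna_alt=30, rna_ref=10, rna_alt=10)
depleted = MutationCounts(dna_ref=30, dna_alt=30, rna_ref=18, rna_alt=2)
uncovered = MutationCounts(dna_ref=30, dna_alt=30, rna_ref=2, rna_alt=1)

for name, m in [("balanced", balanced), ("depleted", depleted), ("uncovered", uncovered)]:
    print(name, vaf_diff(m))
```

Under these assumptions, a negative value (as for `depleted`) means the mutant allele is under-represented in RNA relative to DNA, which is the direction the abstract reports for nonsense mutations. A value of `None` marks a site with too little RNA coverage to call.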

Figures

Figure 1
Separating Expression of X-Linked Genes into Stromal and Tumor Compartments. (A) Fraction of RNA-seq reads reporting the reference allele of heterozygous germline SNPs on the X chromosome in one of the patients (PD4120a). The depth of color reflects the level of expression. (B) Fraction of transcripts derived from tumor cells for each heterozygous germline SNP shown in (A), estimated with a Bayesian Dirichlet process, is shown. (C) Estimated distribution and 95% posterior intervals for relative gene expression in cancer versus stromal cells for PD4120a. The y axis reports the estimated density of genes; the x axis reports the fraction of transcripts for each gene deriving from cancer cells. Thus, the transcripts for most genes in PD4120a are 80%–100% derived from cancer cells and 0%–20% from stromal cells, with only a small peak of genes predominantly expressed from stromal cells. (D) Distributions for several selected primary cancers are shown, as for (C). (E) Overall fraction of transcripts derived from cancer cells (y axis) compared to the estimated proportion of tumor cells in the sample (x axis, estimated from genomic DNA using copy-number profiles) is shown. (F) Increased expression of the mutated allele in ER− as compared to ER+ breast cancer transcriptomes (plotted relative to the genome). Primary breast cancers sequenced as part of TCGA are shown. Plotted on the y axis is the variant allele fraction in the transcriptome, relative to the genome (VAFdiff). (G) Inverse relationship between each tumor’s expression of Estrogen Receptor 1 (ESR1) and the overall expression of its point mutations (shown as VAFdiff; −0.2433, p < 0.0001). Using linear regression analysis to model this relationship, we determined that, for every 1% drop in ESR1, ∼15 additional point mutations are expressed.
Figure 2
The Effect of Somatic Point Mutations on Expression and Aberrant Splicing. (A) Comparison of the variant allele fractions in the transcriptome to the genome, for all classes of point mutation. The squared correlation coefficient between the genome and transcriptome is in parentheses. Only expressed coding changes are shown (at least 5× coverage). (B) Variant allele fractions in the transcriptome relative to the genome. Nonsense mutations >50 bp from the terminal 3′ exon-intron junction are the only variants to show a significant difference. (C) Positional effect of mutations on aberrant splicing is shown.
Figure 3
The Transcriptional Consequences of Structural Rearrangement. All rearrangement types and their position with respect to genes are shown as a matrix in both panels. Transcriptional disruptions caused by each rearrangement type are shown within the matrix. (A) Number of rearrangements causing aberrant transcription. Normalized aberrant transcription levels were contrasted between the sample that contained the rearrangement and all others. Plotted is the aberrant transcription ranking of the rearranged sample relative to all others for the same genes (red bars). The pie charts show the fraction of all rearrangements of that type that are excess in the final rank compared to the number expected under a uniform distribution. (B) Types of aberrant transcriptional events caused by rearrangements are shown.
Figure 4
Rearrangements between and within Genes. (A) Fusions caused by rearranged genes in the same orientation are shown. (B) Proportion of rearrangements predicted to lead to an in-frame event contrasted to the proportion actually expressing in-frame transcripts (top). Characteristics of expressed fusions (bottom) are shown. (C) Many fusions are expressed in multiple isoforms.
Figure 5
Antisense Expression Caused by Rearranged Genes in Opposite Orientation. (A) Stacked bar plot shows the number of expressed transcripts per sample resulting from gene fusions in opposite orientation. (B) The diversity of chimeric transcripts produced by gene-to-gene rearrangements. The expression level of each transcript is plotted on the y axis. Tail-to-tail gene pairs (green) are rarely expressed, whereas, surprisingly, sense-to-sense and sense-to-antisense fusions show similar levels of expression (blue and red, respectively). Transcripts are placed on the x axis according to the type of read joining the two genes. Genes adjoined by exonic reads are plotted to the right on the x axis, and genes brought together only by exon-to-intron reads are on the left. (C) Examples of productive, stable antisense fusion transcripts. Plotted on the y axis are the read depths supporting the fusion. Hatched lines indicate rearrangement breakpoints. In most cases, we observed a single donor gene, which expresses sequence from its sense strand (yellow), and a single acceptor gene, which expresses sequence from its antisense strand (red). Rarely are both promoters used, leading to reciprocal sense-antisense fusions (both genes express sense and antisense sequence). The fusions SZT2-SLC6A9 and SLC6A9-SZT2 are examples of a reciprocal pair. In general, antisense transcripts display features of traditional exons: they are stably expressed, around 200 bp, and are frequently spliced at GT-AG splice sites (asterisks).
Figure 6
Non-canonical Fusions Caused by Gene-to-Intergenic Breakpoints. (A) Percentage of gene-to-intergenic rearrangements causing fusions is shown. (B) Length of the intron created is shown. (C) Genes involved in non-canonical fusions. We observed 18 fusions where a broken gene (donor) splices to another gene (acceptor) that is itself unbroken and often distant. These fusions can be highly expressed (width of line) and cause in-frame transcripts (red line).
Figure 7
Regions of Local Complexity Give Rise to Unique Transcriptional Consequences. A region of local complexity is any gene footprint that contains two or more genomic rearrangements. Local complexity can occur in regions of chromothripsis and high-level amplification. (A) Proportion of simple and complex rearrangements that lead to an expressed transcript, grouped by sample, is shown. (B) Regions of local complexity and their transcriptional consequences. Two samples’ regions of complexity are shown as pairs of Circos plots. The genomic events one would predict to be expressed are highlighted (blue arcs). Often the tumors do not express these events, or they amalgamate multiple cis rearrangements and express a transcript that combines genes only indirectly linked to one another.
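
The rank-based comparison described for Figure 3A can be sketched as follows. For each rearrangement, the sample carrying it is ranked against all samples by aberrant transcription level at the same gene, and the number of top-ranked cases is compared with what a uniform distribution of ranks would give. The function names, the tie handling, the toy data, and the reading of "excess in the final rank" as observed minus expected top-rank counts are all assumptions made for illustration, not the authors' implementation.

```python
def rank_of_sample(levels, idx):
    """1-based rank of levels[idx] among all samples (highest level = top rank).

    Ties count against the rearranged sample, a conservative choice
    made for this sketch.
    """
    return 1 + sum(1 for v in levels if v < levels[idx])


def excess_top_rank(events):
    """Compare top-rank counts with the uniform expectation.

    events: list of (levels, idx) pairs, where levels holds one normalized
    aberrant-transcription value per sample and idx marks the sample that
    carries the rearrangement.
    Returns (observed, expected, excess_fraction).
    """
    observed = sum(
        1 for levels, idx in events
        if rank_of_sample(levels, idx) == len(levels)
    )
    # If ranks were uniform, each event is top-ranked with probability 1/n.
    expected = sum(1 / len(levels) for levels, _ in events)
    excess_fraction = max(0.0, observed - expected) / len(events)
    return observed, expected, excess_fraction


# Toy data: four rearrangements, four samples each (values are made up).
events = [
    ([0.0, 0.1, 0.9, 0.0], 2),  # rearranged sample has the highest level
    ([0.5, 0.0, 0.0, 0.1], 0),  # highest
    ([0.0, 0.0, 0.0, 0.7], 3),  # highest
    ([0.2, 0.3, 0.1, 0.0], 3),  # lowest: no sign of aberrant transcription
]

observed, expected, excess = excess_top_rank(events)
print(observed, expected, excess)
```

In this toy input, three of four rearranged samples rank highest where one would be expected by chance, so half of the events are in excess. With the 23 tumors described in the abstract, the chance expectation per rearrangement would be 1/23 under the same assumption.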
