Nucleic Acids Res. 2014 Dec 16;42(22):13557-72.
doi: 10.1093/nar/gku885. Epub 2014 Nov 6.

Aberrant Transcriptional Regulations in Cancers: Genome, Transcriptome and Epigenome Analysis of Lung Adenocarcinoma Cell Lines

Free PMC article

Ayako Suzuki et al. Nucleic Acids Res.

Abstract

Here we conducted an integrative multi-omics analysis to understand how cancers harbor various types of aberrations at the genomic, epigenomic and transcriptional levels. To elucidate the biological relevance of these aberrations and their mutual relations, we performed whole-genome sequencing, RNA-Seq, bisulfite sequencing and ChIP-Seq of 26 lung adenocarcinoma cell lines. The collected multi-omics data allowed us to associate an average of 536 coding mutations and 13,573 mutations in promoter or enhancer regions with aberrant transcriptional regulations. We detected 385 splice site mutations and 552 chromosomal rearrangements, representative cases of which were validated to cause aberrant transcripts. Averages of 61, 217, 3687 and 3112 mutations were located in regulatory regions that showed differential DNA methylation, H3K4me3, H3K4me1 and H3K27ac marks, respectively. The patterns of aberration in transcriptional regulation differed from gene to gene. Irregular histone marks were characteristic of EGFR and CDKN1A, whereas a large genomic deletion and DNA hypermethylation were most frequent for CDKN2A. We also used the multi-omics data to classify the cell lines with respect to their hallmarks of carcinogenesis. Our datasets should provide a valuable foundation for the biological interpretation of interlaced genomic and epigenomic aberrations.

Figures

Figure 1.
Whole-genome sequencing for genomic aberrations. (A) The number of SNVs and indels detected in the 26 cell lines. For each cell line, the number of all somatic mutation candidates and the number of those in the protein-coding regions are shown in the upper and lower panels, respectively. The x-axis is sorted by the origins of the cell lines and the increasing total number of non-synonymous SNVs and indels. (B) Examples of copy number information. The normalized copy number profiles of H1703 and LC2/ad are shown in the upper and lower panels, respectively. Examples of genes for which possible CNAs were detected are indicated by arrows (red for amplification and blue for deletion). (C) Examples of mutated genes in the 26 cell lines. Mutations identified in the EGFR, TP53 and SMARCA4 genes are shown. Types of mutations are as indicated in the inset. One mutation in the TP53 gene was added by manual inspection. (D) Genomic aberrations of the selected 26 cancer-related genes. SNVs and indels in the protein-coding regions and splice sites, as well as CNAs, are shown.
Figure 2.
RNA-Seq for transcriptome analyses. (A) The number of mutations in expressed genes (> 1 RPKM) and non-expressed genes for each cell line. Non-synonymous SNVs (red) and indels (blue) in the protein-coding regions were counted according to whether the genes harboring them are expressed (bright) or not (pale). The x-axis is sorted in the same order as in Figure 1A. (B) Aberrant splicing events with splice site mutations. For the NF1 gene, IGV visualizes splice site SNVs in whole-genome sequences and skipping of the 19th exon in RNA-Seq of PC-7 compared with RNA-Seq of PC-9, A549 and H322. (C) Examples of fusion transcripts detected in this study. The CCDC6-RET fusion in LC2/ad, the EFHD1-UBR3 fusion in PC-9 and the ERGIC2-CHRNA6 fusion in H1347 were validated by RT-PCR. (D) The numbers of differentially expressed genes are shown for the 26 cell lines (top panel for genes with higher expression and bottom panel for genes with lower expression). (E) Gene expression patterns of the 26 cancer-related genes. The heat map represents the fold value against the average expression level in the 26 cell lines. The color key is as shown in the inset.
Figure 3.
Bisulfite sequencing for analyzing DNA methylation status. (A) Summary of DNA methylation in each cell line. Upper panel: average DNA methylation rates were calculated at each CpG site in CpG islands or non-CpG islands to draw the heat map. Lower panel: results of a similar analysis for the CpG islands, CpG shores and promoters. The color key is shown in the inset. (B) The numbers of differentially hyper- (upper panel) or hypo- (lower panel) methylated genes in each of the 26 cell lines. The populations of the genes having the indicated fold changes are separately colored as shown in the insets. (C) DNA methylation patterns, as indicated by the color key, are shown for the representative promoters of the selected 26 cancer-related genes. Slashes indicate where a genomic deletion was observed. (D) DNA methylation of the CDKN2A gene. The degree of methylation at each CpG site (vertical line) is colored as indicated in the inset. Each line represents the information for the indicated cell line. Cell lines for which genomic deletions were observed are also indicated. SNVs and indels detected in p16INK4a are shown in red letters. A gene model is shown at the bottom.
Figure 4.
ChIP-Seq for the eight chromatin marks. (A) Correlation among the eight chromatin signatures. Spearman's rank correlation coefficients were calculated between the indicated pairs of chromatin marks and colored following the color key shown in the inset. Averages of the 26 cell lines were used to assign the colors. (B) The numbers of differentially utilized chromatin marks for the 26 cell lines. Transcriptionally active marks, repressive marks and enhancer marks are represented in the upper, middle and lower panels, respectively. (C) Chromatin states based on ChromHMM for the 26 cancer-related genes. ChromHMM maps were drawn for each cell line (see the Materials and Methods section and Supplementary Figure S20). Chromatin states that most frequently appeared in the promoter, gene body and enhancers of each gene are shown in the left, middle and right panels, respectively.
Figure 5.
Integrative analysis of multi-omics data. Transcriptomic and epigenomic status of the CDKN1A gene. Expression levels of CDKN1A are shown in the upper graph. RNA-Seq, bisulfite sequencing and ChIP-Seq patterns of CDKN1A are also shown for the four cell lines, indicated in the graph.
Figure 6.
Multi-layered aberrations in ‘hallmarks of cancer’. (A) Potential aberrant events in the genome, epigenome and transcriptome in each of the 10 hallmarks of cancer. For PC-3 (left) and PC-7 (right), percentages of genes with mutations, differential expression, differential DNA methylation and differential chromatin marks (H3K4me3, H3K27me3 and H3K9me3) are shown. (B) Percentages of genes with differential epigenomic marks in ‘Avoiding Immune Destruction’ for PC-3 and PC-7. (C) Aberrant epigenomic and transcriptomic events in cancer cell lines compared to SAEC. Percentages of genes with differentially higher or lower expression and chromatin marks are shown for the 26 cell lines. Merged percentages when all of the 26 cell lines are considered are shown. Small square columns in the surrounding margin represent the frequencies in individual cell lines. The color code for the frequency and the order of the cell lines are shown in the right margin. (D) Comparison of the variations in the aberrant events (differential features) within cancerous cell lines with the deviations from a normal cell, SAEC. Percentages of aberrant features (y-axis) in the hallmarks ‘Enabling Replicative Immortality’ (top), ‘Genome Instability and Mutation’ (middle) and ‘Avoiding Immune Destruction’ (bottom) in the transcriptome layer are shown for the indicated cell lines (x-axis). Solid and broken lines represent the frequencies compared to SAEC and to the averages of the 26 cancer cell lines, respectively. Cell lines are ordered on the x-axis by increasing frequency of aberrations in comparison with SAEC (solid line).
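
The Figure 2 caption relies on two simple expression measures: the "> 1 RPKM" cutoff for calling a gene expressed (panel A) and the fold value against the average expression level across cell lines (panel E). A minimal sketch of both follows, using the standard RPKM definition. The function names, the default threshold argument and the input values are hypothetical illustrations; the authors' actual pipeline is not described in this abstract and may differ.

```python
from statistics import mean


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_expressed(rpkm_value, threshold=1.0):
    """Call a gene expressed when its RPKM is strictly above the cutoff (> 1 RPKM in Figure 2A)."""
    return rpkm_value > threshold


def fold_vs_average(rpkm_by_cell_line):
    """Fold value of each cell line against the mean expression over all cell lines.

    Assumption: the average is a plain arithmetic mean of RPKM values.
    """
    avg = mean(rpkm_by_cell_line.values())
    if avg == 0:
        raise ValueError("average expression is zero; fold value is undefined")
    return {name: value / avg for name, value in rpkm_by_cell_line.items()}


# Hypothetical inputs, not data from the study.
example_rpkm = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=25_000_000)
example_folds = fold_vs_average({"line_a": 2.0, "line_b": 6.0})
```

With these made-up inputs, a gene of 2 kb covered by 500 reads out of 25 million mapped reads has an RPKM of 10 and would count as expressed.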
