, 9 (5), e98293
eCollection

Identification of Druggable Cancer Driver Genes Amplified Across TCGA Datasets

Ying Chen et al. PLoS One.

Erratum in

  • PLoS One. 2014;9(9):e107646

Abstract

The Cancer Genome Atlas (TCGA) projects have advanced our understanding of the driver mutations, genetic backgrounds, and key pathways activated across cancer types. Analyses of TCGA datasets have mostly focused on somatic mutations and translocations, with less emphasis placed on gene amplifications. Here we describe a bioinformatics screening strategy to identify putative cancer driver genes amplified across TCGA datasets. We carried out GISTIC2 analysis of TCGA datasets spanning 16 cancer subtypes and identified 486 genes that were amplified in two or more datasets. The list was narrowed to 75 cancer-associated genes with potential "druggable" properties. The majority of the genes were localized to 14 amplicons spread across the genome. To identify potential cancer driver genes, we analyzed gene copy number and mRNA expression data from individual patient samples and identified 42 putative cancer driver genes linked to diverse oncogenic processes. Oncogenic activity was further validated by siRNA/shRNA knockdown and by referencing the Project Achilles datasets. The amplified genes represented a number of gene families, including epigenetic regulators, cell cycle-associated genes, DNA damage response/repair genes, metabolic regulators, and genes linked to the Wnt, Notch, Hedgehog, JAK/STAT, NF-κB and MAPK signaling pathways. Among the 42 putative driver genes were known driver genes, such as EGFR, ERBB2 and PIK3CA. Wild-type KRAS was amplified in several cancer types, and KRAS-amplified cancer cell lines were most sensitive to KRAS shRNA, suggesting that KRAS amplification was an independent oncogenic event. A number of MAP kinase adapters were co-amplified with their receptor tyrosine kinases, such as the FGFR adapter FRS2 and the EGFR family adapters GRB2 and GRB7. The ubiquitin-like ligase DCUN1D1 and the histone methyltransferase NSD3 were also identified as novel putative cancer driver genes.
We discuss the patient tailoring implications for existing cancer drug targets and we further discuss potential novel opportunities for drug discovery efforts.
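
The first filtering step described in the abstract (keeping genes amplified in two or more datasets) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `recurrently_amplified`, the input layout (one list of amplified genes per dataset), and the placeholder gene and dataset names are hypothetical and are not taken from the paper or from GISTIC2 output.

```python
from collections import Counter


def recurrently_amplified(amplified_by_dataset, min_datasets=2):
    """Return genes reported as amplified in at least `min_datasets` datasets.

    `amplified_by_dataset` maps a dataset label to the genes called amplified
    in that dataset (hypothetical layout, assumed for this sketch).
    """
    counts = Counter()
    for genes in amplified_by_dataset.values():
        # Use a set so a gene listed twice in one dataset is counted once.
        counts.update(set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_datasets)


# Placeholder input, not real TCGA calls.
toy_calls = {
    "DATASET_1": ["GENE_A", "GENE_B", "GENE_C"],
    "DATASET_2": ["GENE_B", "GENE_D"],
    "DATASET_3": ["GENE_A", "GENE_B", "GENE_A"],
}

print(recurrently_amplified(toy_calls))
```

Raising `min_datasets` makes the screen stricter; the later druggability and driver filters in the abstract would be applied to the returned list.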

Conflict of interest statement

Competing Interests: This study was funded fully by Eli Lilly and Company, the employer of all authors. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials, as detailed online in the guide for authors.

Figures

Figure 1
Figure 1. Flowscheme for the identification of cancer amplified genes with putative cancer driver activity.
TCGA datasets were mined for gene amplification (GISTIC2 analysis, cBio portal) and 461 gene amplifications were identified. The list was narrowed to 73 cancer-related genes that were potentially “druggable” based on external druggability databases. From the 73 genes, 40 putative cancer driver genes were identified based on copy number versus mRNA expression analysis of TCGA data.
Figure 2
Figure 2. Identification of 73 genes amplified in TCGA datasets.
From the initial list of 461 genes amplified in one or more TCGA datasets, 73 amplified genes were identified with potentially “druggable” properties as well as established/putative roles in oncogenesis. Genes/amplicons are arranged by chromosomal location, with their genomic location marked as shown (Mb = Megabase). Colored boxes indicate cancer types with TCGA designations, as follows: BLCA - Bladder Urothelial Carcinoma, BRCA - Breast invasive carcinoma, CRC - Colorectal Cancer (COAD and READ studies combined), GBM - Glioblastoma multiforme, HNSC - Head and Neck squamous cell carcinoma, KIRC - Kidney renal clear cell carcinoma, LGG - Brain Lower Grade Glioma, LUAD - Lung adenocarcinoma, LUSC - Lung squamous cell carcinoma, OV - Ovarian serous cystadenocarcinoma, PRAD - Prostate adenocarcinoma, SKCM - Skin Cutaneous Melanoma, STAD - Stomach adenocarcinoma, UCEC - Uterine Corpus Endometrioid Carcinoma.
Figure 3
Figure 3. Gene copy number and mRNA expression correlation analysis to identify putative driver genes amplified on chromosomes 1–11.
Pearson correlation coefficients were calculated by analyzing gene copy number and mRNA expression from individual patient-derived samples in TCGA datasets. Shown are the correlation coefficients for each TCGA cancer subtype and the mean correlation across all cancer types (red denotes high correlation, blue denotes low correlation). Abbreviations of TCGA datasets are listed in Figure 1.
Figure 4
Figure 4. Gene copy number and mRNA expression correlation analysis to identify putative driver genes amplified on chromosomes 12–20.
Pearson correlation coefficients were calculated by analyzing gene copy number and mRNA expression from individual patient-derived samples in TCGA datasets. Shown are the correlation coefficients for each TCGA cancer subtype and the mean correlation across all cancer types (red denotes high correlation, blue denotes low correlation). Abbreviations of TCGA datasets are listed in Figure 1.
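
The correlation analysis described for Figures 3 and 4 can be illustrated with a small sketch: a Pearson coefficient is computed per dataset from paired copy number and mRNA expression values, and the per-dataset coefficients are then averaged. The function names, the input layout, and the example numbers are assumptions made for illustration; the source does not specify the authors' implementation or any preprocessing.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_by_dataset(samples_by_dataset):
    """Per-dataset r plus the unweighted mean across datasets.

    `samples_by_dataset` maps a dataset label to a pair
    (copy_numbers, expression_values) for one gene (hypothetical layout).
    """
    per_dataset = {
        name: pearson_r(copy_numbers, expression)
        for name, (copy_numbers, expression) in samples_by_dataset.items()
    }
    mean_r = sum(per_dataset.values()) / len(per_dataset)
    return per_dataset, mean_r


# Made-up values for one gene in two placeholder datasets.
toy = {
    "DATASET_1": ([2, 3, 4, 6], [1.0, 1.5, 2.0, 3.0]),
    "DATASET_2": ([2, 2, 5, 8], [0.9, 1.1, 2.4, 4.2]),
}
per_dataset, mean_r = correlation_by_dataset(toy)
print(per_dataset, mean_r)
```

A gene whose expression rises with copy number gives r close to 1, which corresponds to the "high correlation" end of the color scale in the figure legends.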
Figure 5
Figure 5. Cancer amplified genes in the MAP kinase pathway.
(A) KRAS shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in KRAS shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection. A negative shRNA score suggests decreased cancer cell proliferation/survival after shRNA transfection. Yellow bars indicate cell lines with KRAS copy number >4 and black bars indicate cell lines with KRAS copy number <4. (B) Copy number (x-axis) and mRNA expression (y-axis) for KRAS in a panel of ovarian cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for KRAS in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) KRAS copy number (x-axis) and KRAS relative protein level (y-axis) as measured by western blot in a panel of lung cancer cell lines grown in vitro. (E) Gene amplifications associated with sensitivity to KRAS shRNA in cancer cell lines (Project Achilles). Y-axis = Log10 Likelihood Ratio (LOD) of gene amplification being associated with shRNA score, obtained by comparing each gene amplification model to the “null model” without any gene amplification. (F) KRAS copy number (x-axis) and KRAS shRNA score (y-axis) for individual cancer cell lines color-coded by tumor type (data obtained from Project Achilles). Trendline shown for mean values in each copy number bin.
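
To make the shRNA score in panel A concrete, the sketch below treats it as a log2 ratio of an shRNA's abundance relative to the pool after versus before proliferation, then compares mean scores between cell lines above and below the copy number cutoff of 4. This is a simplified assumption: the legend does not describe the exact Project Achilles normalization, the function names are hypothetical, the values are made up, and assigning a copy number of exactly 4 to the non-amplified group is a choice made here because the legend only states >4 and <4.

```python
import math
from statistics import mean


def shrna_score(final_fraction, initial_fraction):
    """log2 change in an shRNA's share of the pool (negative = depleted)."""
    if final_fraction <= 0 or initial_fraction <= 0:
        raise ValueError("fractions must be positive")
    return math.log2(final_fraction / initial_fraction)


def mean_score_by_amplification(cell_lines, threshold=4):
    """Mean shRNA score for amplified vs. other cell lines.

    `cell_lines` is a list of (copy_number, shrna_score) pairs.
    Lines with copy_number > threshold count as amplified; all others,
    including copy_number == threshold, go to the second group (assumption).
    Returns None for an empty group.
    """
    amplified = [score for cn, score in cell_lines if cn > threshold]
    other = [score for cn, score in cell_lines if cn <= threshold]
    return (
        mean(amplified) if amplified else None,
        mean(other) if other else None,
    )


# Made-up (copy number, score) pairs, not Project Achilles data.
toy_lines = [(6, -2.0), (8, -1.0), (2, 0.0), (3, -0.5)]
print(shrna_score(0.25, 1.0))
print(mean_score_by_amplification(toy_lines))
```

A more negative mean in the amplified group would match the pattern the legend describes, where amplified lines are more sensitive to the shRNA.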
Figure 6
Figure 6. GRB7 and DCUN1D1 are novel cancer amplified genes with putative driver activity.
(A) GRB7 shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in GRB7 shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection. A negative shRNA score suggests decreased cancer cell proliferation/survival after shRNA transfection. Yellow bars indicate cell lines with GRB7 copy number >4 and black bars indicate cell lines with GRB7 copy number <4. (B) Copy number (x-axis) and mRNA expression (y-axis) for GRB7 in a panel of breast cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for GRB7 and ERBB2 in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) Copy number (x-axis) and mRNA expression (y-axis) for DCUN1D1 in lung squamous cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (E) Relative proliferation (y-axis) of the cancer cell lines KYSE, T47D, SW48, and HCT15 6 days after infection with DCUN1D1 lentiviral shRNA particles, as measured by Cell Titer Glo assay.
Figure 7
Figure 7. Epigenetic regulatory genes as putative cancer amplified driver genes.
(A) Copy number (x-axis) and mRNA expression (y-axis) for NSD3 and SETDB1 in breast cancers and melanomas, respectively. Correlation coefficients for copy number and mRNA expression are listed in the top right (r value). (B) BRD4 and YEATS4 shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in the representative shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection. Yellow bars indicate cell lines with BRD4 or YEATS4 copy number >4 and black bars indicate cell lines with BRD4 or YEATS4 copy number <4. (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for NSD3, SETDB1, YEATS4, and BRD4 in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) Relative NSD3 protein level (y-axis, normalized to β-actin protein levels) compared with NSD3 copy number (x-axis) in SW48, H1581, SW837, and H1703 cells. (E) Relative proliferation (y-axis) and (F) relative apoptosis levels of the cancer cell lines H1581, H1703, SW48, and SW837 3 days after transfection with NSD3 siRNA, as measured by Cell Titer Glo and Caspase Glo assays, respectively. (G) Cell cycle profile of H1703 cells 24 or 48 hours after transfection with NSD3 siRNA compared to non-transfected controls. (H) Relative changes of cells in apoptosis, G1 or G2 phases (y-axis) in cell lines 48 hours post-NSD3 siRNA transfection compared to non-transfected controls.

Grant support

This study was funded by Eli Lilly and Company. The funder provided support in the form of salaries for all authors, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the author contributions section.