, 5 (186), 186ra66

Ectopic Activation of Germline and Placental Genes Identifies Aggressive Metastasis-Prone Lung Cancers


Sophie Rousseaux et al. Sci Transl Med.


Activation of normally silent tissue-specific genes and the resulting cell "identity crisis" are the unexplored consequences of malignant epigenetic reprogramming. We designed a strategy for investigating this reprogramming, which consisted of identifying a large number of tissue-restricted genes that are epigenetically silenced in normal somatic cells and then detecting their expression in cancer. This approach led to the demonstration that large-scale "off-context" gene activations systematically occur in a variety of cancer types. In our series of 293 lung tumors, we identified an ectopic gene expression signature associated with a subset of highly aggressive tumors, which predicted poor prognosis independently of the TNM (tumor size, node positivity, and metastasis) stage or histological subtype. The ability to isolate these tumors allowed us to reveal their common molecular features characterized by the acquisition of embryonic stem cell/germ cell gene expression profiles and the down-regulation of immune response genes. The methodical recognition of ectopic gene activations in cancer cells could serve as a basis for gene signature-guided tumor stratification, as well as for the discovery of oncogenic mechanisms, and expand the understanding of the biology of very aggressive tumors.

Conflict of interest statement

Competing interests: The following patent applications include results presented in the paper: PCT/EP2009/053809, PCT/EP2011/068375, and PCT/EP2011/068377.


Fig. 1. Ectopic expressions of TS/PS genes detected in multiple cancers
(A) Frequencies of activation of TS/PS genes in 14 different types of solid tumors from our analysis of transcriptomic data available online obtained from 1776 solid tumor samples (GSE2109, described in table S3) shown here for the 65 most frequently activated genes on a black (0%) to red (100%) scale (heat map). Overall frequencies of gene activation in all tumor samples are presented in the histogram on the right. (B) Transcriptomic profiles of TS/PS genes on a dedicated microarray in normal human tissues and in tumor samples and cell lines: “P,” “T,” “Ctrl soma,” “Cancer,” and “CCL” respectively indicate placenta, testis (n = 2), adult somatic tissues, cancer samples of various origins, and cancer cell lines (n = 3). All samples are listed in table S5. Color code: black, no expression (“Off”); red, gene activation (“On”). (C) Expression of a selection of 13 frequently expressed TS/PS genes detected by qRT-PCR in 73 tumor samples of eight different origins (including breast, colon, kidney, liver, lung, ovarian, prostate, and thyroid). The corresponding nontumor samples (“N”) are shown on the left part of the heat map in the same order.
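The activation frequencies in panel (A) are percentages of tumor samples in which a gene is called "On". A minimal sketch of that calculation follows, assuming the On/Off calls are already available as booleans; the legend does not state how calls were made, and the gene names and the function `activation_frequencies` are hypothetical placeholders, not the authors' pipeline.

```python
def activation_frequencies(on_calls):
    """Return the percentage of samples in which each gene is called On.

    on_calls: dict mapping a gene name to a list of booleans, one per
    tumor sample (True = "On"). How a call is made is an assumption
    outside this sketch.
    """
    freqs = {}
    for gene, calls in on_calls.items():
        if not calls:
            raise ValueError(f"no samples for gene {gene!r}")
        freqs[gene] = 100.0 * sum(1 for c in calls if c) / len(calls)
    return freqs


# Toy data with hypothetical gene names (not taken from the paper).
toy_calls = {
    "GENE_A": [True, False, False, True],
    "GENE_B": [False, False, False, False],
}
freqs = activation_frequencies(toy_calls)

# Rank genes from most to least frequently activated, as a heat map
# restricted to the most frequently activated genes would require.
ranked = sorted(freqs, key=freqs.get, reverse=True)
```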
Fig. 2. TS/PS genes with hypermethylated promoters more susceptible to deregulation in cancer
(A) Box plots showing the respective distributions of activation frequencies in cancer (considering all 1776 cases of solid tumors analyzed in the data set GSE2109) of the two groups of genes, associated either with CpG-poor promoters (left) or with hypermethylated CpG-rich promoters (right). (B) Activation of TS/PS genes in response to DNA demethylation in HCT116 DNMT DKO (HCT116 cell line with double inactivation of DNMT1 and DNMT3b) compared to HCT116+/+ (wild type) using our dedicated microarray; the histogram shows the numbers of genes activated (black) or not (gray) according to their promoter category, either CpG-poor or CpG-rich hypermethylated (“CpG-rich HyperMe”). (C) qRT-PCR detecting the expression of 49 TS/PS genes among those associated with a CpG-rich hypermethylated promoter (listed on the x axis of the histogram) in HCT116 cell line wild type (+/+, light gray bars), KO for DNMT3b (dark gray bars), and DKO for DNMT1 and DNMT3b (red bars); values are fold changes in reference to the normalized values obtained in HCT116+/+.
Fig. 3. Ectopic expression of TS/PS genes in the series of 293 lung cancer patients at all stages of the disease
(A) The heat map (left panel) shows the detection of TS/PS gene expression in all 293 patients (x axis) including adenocarcinoma (ADC), basaloid (BAS), carcinoid tumors (CARCI), large cell neuroendocrine tumors (LCNE), small cell carcinoma (SCC), and squamous cell carcinoma (SQC) histological subtypes, as well as in NL samples. Heat map color code: black, no activation; red, activation. The histogram (right panel) shows the frequency of lung cancer tumors (x axis, in %) aberrantly expressing each of the same TS/PS genes (y axis). (B) Heat map showing the expression of TS/PS genes focusing on the early lung cancer (T1N0) cases (n = 152) of the series, as described above; all patients are included in (A), but here they are sorted by decreasing number of ectopically expressed genes (not by histological subtypes). (C) Heat map focusing on the expression of TS/PS genes in a subset of 15 paired tumor samples (“Paired T”) and their corresponding NL [these samples are also shown in the previous two heat maps in (A) and (B), but in different order because they are of different TNM stages and histology]; this figure shows only the genes activated in this subset of patients. (D) Heat map showing the methylation levels (β values from 0 to 100% on a light gray to blue color scale) of 347 CpGs associated with the transcription start site (TSS) and 5′ untranslated regions of 88 TS/PS genes in normal somatic tissue samples (from left to right: mean methylation value in adipose tissue, adrenal gland, bladder, blood, brain, heart, lymph node, pancreas, skeletal muscle, spleen, stomach, ureter, and five fetal and two adult lung samples; data available on the GEO Web site under the reference GSE31848), as well as in 55 lung tumor samples of our series. β Values are shown in table S9. 
(E) Scatter plots corresponding to the individual CpGs localized near the TSS (−1500 to +1500 base pairs) of two genes, BRDT and MAGEB6, showing that the methylation levels (β values on the y axis) correlated with their respective expression levels (log2 ratios) in the 55 lung cancer patients. The positions of the CpGs relative to the TSS are indicated between brackets.
Fig. 4. Off-context activations of 26 TS/PS genes independently associated with poor prognosis in lung cancer
(A and B) Cumulative global Kaplan-Meier survival estimates of the 293 patients in our series either grouped together (A, left panel) or divided into three groups according to the number of ectopic expressions found within the subset of 26 genes. The groups were defined as follows: P1 (no expression, red curve, n = 121), P2 (one or two expressed genes, blue curve, n = 125), and P3 (three or more ectopically expressed genes, black curve, n = 47) (A, right panel). (B) The left panel shows the survival probabilities of patients according to the TNM stage (as indicated). The middle panel shows the survival probabilities of the three groups (P1, P2, and P3) defined by our classifying genes, considering only the T1N0 patients. The same approach was used to classify the T>1/N>0 patients (right panel). (C) Forest plot of HRs (P3 versus P1; on a log scale) for overall risk of death (more than 5 years). A univariate Cox proportional hazard model estimated HRs for the overall risk of death. The horizontal lines provide the 95% confidence interval for the ratios. The vertical red dotted double-arrow line corresponds to an HR of 1. (D) Histograms showing the frequencies of relapse (local recurrence and/or metastasis) observed in P1 and P3 patients. (E) Box plots showing the distribution of times (in months) for P1 and P3 patients, corresponding to overall survival (left), survival before relapse (middle), and survival after the diagnosis of relapse (right).
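The P1/P2/P3 grouping described in this legend is a simple counting rule over the 26 classifying genes. The sketch below illustrates it under the assumption that each patient's per-gene On/Off calls are already available; the function names, patient identifiers, and toy data are hypothetical, and the legend does not specify how expression calls were made.

```python
from collections import Counter


def prognosis_group(n_expressed):
    """Map a count of ectopically expressed classifier genes to a group.

    Thresholds follow the legend: P1 = none, P2 = one or two,
    P3 = three or more.
    """
    if n_expressed < 0:
        raise ValueError("count of expressed genes cannot be negative")
    if n_expressed == 0:
        return "P1"
    if n_expressed <= 2:
        return "P2"
    return "P3"


def classify_patients(calls):
    """calls: dict mapping patient id -> iterable of booleans, one per
    classifier gene (True = ectopically expressed)."""
    return {
        pid: prognosis_group(sum(1 for c in gene_calls if c))
        for pid, gene_calls in calls.items()
    }


# Toy input with hypothetical patient identifiers (26 calls each).
toy_calls = {
    "patient_a": [False] * 26,
    "patient_b": [True, True] + [False] * 24,
    "patient_c": [True, True, True] + [False] * 23,
}
groups = classify_patients(toy_calls)
group_sizes = Counter(groups.values())
```

The resulting group labels could then be passed to any survival analysis tool to draw per-group Kaplan-Meier curves; that step is not shown here.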
Fig. 5. Validation of the 26-gene prognosis-classifying strategy for lung cancer patients
(A) qRT-PCR detection of the expression of our classifying genes. The heat map (left) shows the detection of four frequently expressed TS/PS genes (upper panel) and of the prognosis-classifying genes (lower panel) in a subset of 61 patients from our lung cancer patient series. The survival curves (right) compare the survival probabilities between patients assigned to the P1, P2, and P3 groups by qRT-PCR. (B) Cumulative global Kaplan-Meier survival estimates of the patients from two external lung cancer populations either combined (top panels: the continuous lines show the mean survival probability, and the dotted lines correspond to the 95th percentile) or divided into the three groups P1, P2, and P3 defined as in Fig. 4A.
Fig. 6. Biological and clinical characteristics of aggressive lung tumors revealed by differential expression profiling
(A) Heat maps showing the expression of the 26 TS/PS genes (upper panel), as well as genes up-regulated (middle part) and down-regulated (lower part) in aggressive tumors (P3) compared with the “good prognostic” group of tumors (P1). The robust multiarray average normalized values of expression of these genes are represented in the indicated color scales (green, low expression; red, high expression); the patients, represented on the x axis, were classified by prognostic groups and ranked by increasing value of the differential expression between up- and down-regulated genes; genes were ranked by decreasing difference of expression values between P1 and P3 samples. Blue frame: subset of the patients classified as group P1 presenting a P3-like molecular profile (P1P3L); gray frame: carcinoid tumors displaying an atypical molecular profile. (B) The Kaplan-Meier curves represent the respective survival probabilities of the subgroups of P1 patients with the P3-like expression profile (P1P3L, blue curve), other P1 patients (red curve), and P3 patients (black curve). (C to F) Enrichment plots displaying the normalized enrichment scores of some of the highly significant overlapping gene sets identified with gene set enrichment analysis (GSEA). The green curve shows the running enrichment score (y axis) for the gene set as the analysis walks down the ranked list of genes (x axis). The black bars along the x axis represent the genes of the gene set, ranked according to their fold change of expression in the “P3 versus P1” transcriptomic analysis (from left to right: up- to down-regulated genes).
