Lung adenocarcinoma, the most common subtype of non-small cell lung cancer, is responsible for more than 500,000 deaths per year worldwide. Here, we report exome and genome sequences of 183 lung adenocarcinoma tumor/normal DNA pairs. These analyses revealed a mean exonic somatic mutation rate of 12.0 events/megabase and identified the majority of genes previously reported as significantly mutated in lung adenocarcinoma. In addition, we identified statistically recurrent somatic mutations in the splicing factor gene U2AF1 and truncating mutations affecting RBM10 and ARID1A. Analysis of nucleotide context-specific mutation signatures grouped the sample set into distinct clusters that correlated with smoking history and alterations of reported lung adenocarcinoma genes. Whole-genome sequence analysis revealed frequent structural rearrangements, including in-frame exonic alterations within EGFR and SIK2 kinases. The candidate genes identified in this study are attractive targets for biological characterization and therapeutic targeting of lung adenocarcinoma.


Figure 1. Mutation spectrum analysis of 183 lung adenocarcinomas
(A) Hierarchical clustering of 183 lung adenocarcinomas according to their nucleotide context-specific exonic mutation rates. Each column represents a case, and each row represents one of 96 strand-collapsed trinucleotide context mutation signatures. Top bar: patient-cluster membership. Left bar: simplified single-nucleotide context mutational signature. Bottom bars: reported tumor stage, age, and smoking status for each patient. Right gradient: mutation rate scale. (B) Stratification of reported versus imputed smoking status by the log transform of the adjusted ratio of C->A transversion rates and CpG->T transition rates. The color of each inner solid point represents the reported smoking status for that particular patient. The color of each outer circle indicates that patient's imputed smoking status as predicted by the classifier. Additional analytic details are provided in the Extended Experimental Procedures.
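The 96 strand-collapsed trinucleotide categories in panel (A) can be illustrated with a minimal sketch. It assumes the common convention of reporting each substitution relative to the pyrimidine (C or T) of the mutated base pair; the caption does not spell out the exact encoding used in the study, and the function names and label format below are hypothetical.

```python
from itertools import product

# Watson-Crick complements, used to flip a mutation onto the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def collapse(trinucleotide, alt):
    """Map a mutation to its strand-collapsed category label.

    `trinucleotide` is the reference context (5' base, mutated base, 3' base)
    and `alt` is the mutant base. If the mutated base is a purine, the
    mutation is re-expressed on the opposite strand so that the reference
    base is always C or T (an assumed, widely used convention).
    """
    ref = trinucleotide[1]
    if ref == alt:
        raise ValueError("alt must differ from the reference base")
    if ref in "AG":
        trinucleotide = reverse_complement(trinucleotide)
        alt = COMPLEMENT[alt]
        ref = trinucleotide[1]
    return f"{trinucleotide[0]}[{ref}>{alt}]{trinucleotide[2]}"


def all_categories():
    """Enumerate every strand-collapsed trinucleotide category."""
    categories = []
    for ref in "CT":
        for alt in "ACGT":
            if alt == ref:
                continue
            for five_prime, three_prime in product("ACGT", repeat=2):
                categories.append(f"{five_prime}[{ref}>{alt}]{three_prime}")
    return categories
```

Two pyrimidine reference bases, three possible substitutions each, and sixteen flanking-base combinations give 2 x 3 x 16 = 96 categories, matching the number of rows described in the caption. Under this convention a G->T change on one strand is counted together with C->A on the other.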
Figure 2. Somatic mutations and copy number changes in 183 lung adenocarcinomas
Top panel, summary of exonic somatic mutation of 25 significantly mutated genes (see text and Table S3 for details). Tumors are arranged from left to right by the number of non-silent mutations per sample, shown in the top track. Significantly mutated genes are listed vertically in decreasing order of non-silent mutation prevalence in the sequenced cohort. Colored rectangles: mutation category observed in a given gene and tumor. Bar chart (right): prevalence of each mutation category in each gene. Asterisks indicate genes significantly enriched in truncating (nonsense, frameshift) mutations. Middle bars: smoking status and mutation spectrum cluster for each patient. White boxes indicate unknown status. Bottom panel: summary of somatic copy number alterations derived from SNP array data. Colored rectangles indicate the copy number change seen for a given gene and tumor.
Figure 3. Somatic mutations of lung adenocarcinoma candidate genes U2AF1, RBM10, and ARID1A
(A) Schematic representation of identified somatic mutations in U2AF1 shown in the context of the known domain structure of the protein. Numbers refer to amino acid residues. Each rectangle corresponds to an independent, mutated tumor sample. Silent mutations are not shown. Missense mutations are shown in black. (B) Schematic of somatic RBM10 mutations. Splice site mutations are shown in purple; truncating mutations are shown in red. Other notations as in (A). (C) Schematic of somatic ARID1A mutations. Notations as in (A) and (B).
Figure 4. Whole genome sequencing of lung adenocarcinoma
(A) Summary of genic rearrangement types across 25 lung adenocarcinoma whole genomes. Stacked-bar plot depicting the types of somatic rearrangement found in annotated genes by analysis of whole genome sequence data from 25 tumor/normal pairs. The “Other Genic” category refers to rearrangements linking an intergenic region to the 3’ portion of a genic footprint. (B) Representative Circos (Krzywinski et al., 2009) plots of whole genome sequence data with rearrangements targeting known lung adenocarcinoma genes CDKN2A, STK11 and EGFR and novel genes MAST2, SIK2, and ROCK1. Chromosomes are arranged circularly end-to-end with each chromosome's cytobands marked in the outer ring. The inner ring displays copy number data inferred from whole genome sequencing with intrachromosomal events in green and interchromosomal translocations in purple.
Figure 5. Identification of a novel lung adenocarcinoma in-frame deletion in EGFR
(A) Schematic representation of reported EGFR alterations (above protein model) for comparison with a C-terminal deletion event found in this study by whole genome sequencing (below protein model). A schematic depiction of sequencing data shows the expected wild-type reads (gray) in contrast with the observed reads (black) spanning or split by the deletion breakpoint. Supporting paired-end and split read mapping data are shown in Figure S5. (B) Soft agar colony forming assay of NIH-3T3 cells expressing exon 25 and 26-deleted EGFR (Ex25&26del) or wild-type EGFR in the presence or absence of ligand stimulation. The bar graph shows the number of colonies formed by indicated cells with or without EGF in soft agar (n=3, mean +SD). (C) Ex25&26del EGFR is constitutively active in the absence of EGF. The same NIH-3T3 cells used for the assay in (B) were subjected to immunoblotting with anti-phospho-tyrosine (4G10), anti-EGFR and anti-phospho-Akt (S473) antibodies. Blots were probed with anti-Akt and anti-β-actin antibodies (loading control). (D) Cell growth induced by the oncogenic EGFR deletion mutant is suppressed by erlotinib treatment. Ba/F3 cells transformed by either L858R or Ex25&26del mutants were treated with increasing concentrations of erlotinib as indicated for 72 hrs and were assayed for cell viability.
Figure 6. Next-generation hallmarks of lung adenocarcinoma
Left, the prevalence of mutation or SCNA of Sanger Cancer Gene Census (Futreal et al., 2004) genes mapping to cancer hallmarks defined by Hanahan and Weinberg (2011). Suspected passenger mutations were filtered out of the analysis, as described in Experimental Procedures. Top right, the mutated genes in the hallmark of “sustaining proliferative signaling” are shown. Bottom right, a proposed “11th hallmark” of Epigenetic and RNA deregulation is shown, depicted as above. Genes shown in gray are candidate lung adenocarcinoma genes identified in this study that may additionally contribute to the hallmark.
