Insertions and Deletions Target Lineage-Defining Genes in Human Cancers

Cell. 2017 Jan 26;168(3):460-472.e14. doi: 10.1016/j.cell.2016.12.025. Epub 2017 Jan 12.


Certain cell types function as factories, secreting large quantities of one or more proteins that are central to the physiology of the respective organ. Examples include surfactant proteins in lung alveoli, albumin in liver parenchyma, and lipase in the stomach lining. Whole-genome sequencing analysis of lung adenocarcinomas revealed noncoding somatic mutational hotspots near VMP1/MIR21 and indel hotspots in surfactant protein genes (SFTPA1, SFTPB, and SFTPC). Extrapolation to other solid cancers demonstrated highly recurrent and tumor-type-specific indel hotspots targeting the noncoding regions of highly expressed genes defining certain secretory cellular lineages: albumin (ALB) in liver carcinoma, gastric lipase (LIPF) in stomach carcinoma, and thyroglobulin (TG) in thyroid carcinoma. The sequence contexts of indels targeting lineage-defining genes were significantly enriched in the AATAATD DNA motif and specific chromatin contexts, including H3K27ac and H3K36me3. Our findings illuminate a prevalent and hitherto unrecognized mutational process linking cellular lineage and cancer.
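As an illustration of the motif named in the abstract, the following is a minimal sketch of how occurrences of AATAATD could be located in a DNA sequence. It assumes the standard IUPAC reading of D as "A, G, or T" and scans the forward strand only. The function names `motif_to_regex` and `find_motif` are hypothetical and are not taken from the paper, whose enrichment analysis is not described here.

```python
import re

# IUPAC nucleotide codes needed for this motif; D means "A, G, or T" (not C).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "D": "[AGT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular-expression pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif="AATAATD"):
    """Return 0-based start positions of all matches, including overlapping ones.

    Hypothetical helper for illustration; forward strand only.
    """
    # A lookahead group lets the regex engine report overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("GGAATAATGCC"))
```

A fuller analysis would also scan the reverse complement and compare motif frequency around indels against a background model; both are omitted from this sketch.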

Keywords: cancer cell of origin; cancer genomics; noncoding genetic variation; somatic mutational processes; statistical driver discovery; variant topography; whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • 3' Untranslated Regions
  • Adult
  • Aged
  • Aged, 80 and over
  • Cell Lineage*
  • Female
  • Humans
  • INDEL Mutation*
  • Male
  • Membrane Proteins / genetics
  • MicroRNAs / genetics
  • Middle Aged
  • Mutation*
  • Neoplasms / genetics*
  • Neoplasms / pathology*
  • Nucleotide Motifs
  • Polymorphism, Single Nucleotide
  • Pulmonary Surfactant-Associated Proteins / genetics

Substances

  • 3' Untranslated Regions
  • MIRN21 microRNA, human
  • Membrane Proteins
  • MicroRNAs
  • Pulmonary Surfactant-Associated Proteins
  • VMP1 protein, human