A machine learning approach for somatic mutation discovery

Sci Transl Med. 2018 Sep 5;10(457):eaar7939. doi: 10.1126/scitranslmed.aar7939.


Variability in the accuracy of somatic mutation detection may affect the discovery of alterations and the therapeutic management of cancer patients. To address this issue, we developed a somatic mutation discovery approach based on machine learning that outperformed existing methods in identifying experimentally validated tumor alterations (sensitivity of 97% versus 90 to 99%; positive predictive value of 98% versus 34 to 92%). Analysis of paired tumor-normal exome data from 1368 TCGA (The Cancer Genome Atlas) samples using this method revealed concordance for 74% of mutation calls but also identified likely false-positive and false-negative changes in TCGA data, including in clinically actionable genes. Determination of high-quality somatic mutation calls improved tumor mutation load-based predictions of clinical outcome for melanoma and lung cancer patients previously treated with immune checkpoint inhibitors. Integration of high-quality machine learning mutation detection in clinical next-generation sequencing (NGS) analyses increased the accuracy of test results compared to other clinical sequencing analyses. These analyses provide an approach for improved identification of tumor-specific mutations and have important implications for research and clinical management of cancer patients.
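The sensitivity and positive predictive value (PPV) figures quoted above are standard ratios of true-positive, false-positive, and false-negative counts. As a minimal sketch of that arithmetic, and not the authors' evaluation pipeline, the Python below compares a set of called mutations against a set of experimentally validated ones. The `Variant` tuple layout, the `evaluate_calls` function, and the toy coordinates are all hypothetical. The sketch also assumes the validated set is complete, so any call absent from it counts as a false positive.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def evaluate_calls(called: Set[Variant], validated: Set[Variant]) -> Dict[str, float]:
    """Compute sensitivity and PPV of a call set against a validated truth set.

    Assumes `validated` is an exhaustive list of true somatic mutations,
    so calls outside it are treated as false positives.
    """
    tp = len(called & validated)   # called and validated
    fp = len(called - validated)   # called but not validated
    fn = len(validated - called)   # validated but missed

    # Guard against empty denominators rather than raising ZeroDivisionError.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "sensitivity": sensitivity, "ppv": ppv}


# Toy data for illustration only; these are not results from the study.
validated_set: Set[Variant] = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "A"),
    ("chr4", 400, "T", "G"),
}
called_set: Set[Variant] = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "A"),
    ("chr9", 900, "G", "A"),  # not in the validated set, counted as a false positive
}

metrics = evaluate_calls(called_set, validated_set)
print(metrics)
```

Keying variants on position plus both alleles is a deliberate simplification. A real comparison would also need consistent normalization of indel representations before the set operations are meaningful.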

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exome / genetics
  • Exome Sequencing
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Immunotherapy
  • Machine Learning*
  • Mutation / genetics*
  • Neoplasms / genetics
  • Neoplasms / immunology
  • Neoplasms / therapy
  • Software