Deep learning integrates histopathology and proteogenomics at a pan-cancer level

Cell Rep Med. 2023 Sep 19;4(9):101173. doi: 10.1016/j.xcrm.2023.101173. Epub 2023 Aug 14.


We introduce a pioneering approach that integrates pathology imaging with transcriptomics and proteomics to identify predictive histology features associated with critical clinical outcomes in cancer. We utilize 2,755 H&E-stained histopathological slides from 657 patients across 6 cancer types from CPTAC. Our models effectively recapitulate distinctions readily made by human pathologists: tumor vs. normal (AUROC = 0.995) and tissue-of-origin (AUROC = 0.979). We further investigate predictive power on tasks not normally performed from H&E alone, including TP53 prediction and pathologic stage. Importantly, we describe predictive morphologies not previously utilized in a clinical setting. The incorporation of transcriptomics and proteomics identifies pathway-level signatures and cellular processes driving predictive histology features. Model generalizability and interpretability are confirmed using TCGA. We propose a classification system for these tasks and suggest potential clinical applications for this integrated human and machine learning approach. A publicly available web-based platform implements these models.
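
For readers unfamiliar with the AUROC figures quoted in the abstract: AUROC is the probability that a randomly chosen positive case (for example, a tumor slide) receives a higher model score than a randomly chosen negative case (a normal slide). The following is a minimal sketch of that calculation using the rank-sum (Mann-Whitney U) formulation. It is not the authors' code; the function name `auroc` and the toy labels and scores are illustrative assumptions, and the abstract does not state how the multi-class tissue-of-origin AUROC was averaged.

```python
def auroc(labels, scores):
    """Binary AUROC via the Mann-Whitney U statistic.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    Tied scores receive their average rank, so a tie counts as half a win.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, averaging over groups of tied scores.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # U = (sum of positive ranks) - minimum possible sum of positive ranks.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy example (made-up values): 1 = tumor, 0 = normal.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(toy_labels, toy_scores))
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values such as 0.995 indicate that almost every tumor slide is scored above almost every normal slide.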

Keywords: CPTAC; cancer imaging; cancer proteogenomics; computational pathology; molecular diagnostics.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Deep Learning*
  • Humans
  • Machine Learning
  • Neoplasms* / genetics
  • Proteogenomics*
  • Proteomics