Predicting Molecular Phenotypes from Histopathology Images: A Transcriptome-Wide Expression-Morphology Analysis in Breast Cancer

Cancer Res. 2021 Oct 1;81(19):5115-5126. doi: 10.1158/0008-5472.CAN-21-0482. Epub 2021 Aug 2.


Molecular profiling is central in cancer precision medicine but remains costly and is based on tumor average profiles. Morphologic patterns observable in histopathology sections from tumors are determined by the underlying molecular phenotype and therefore have the potential to be exploited for prediction of molecular phenotypes. We report here the first transcriptome-wide expression-morphology (EMO) analysis in breast cancer, where individual deep convolutional neural networks were optimized and validated for prediction of mRNA expression in 17,695 genes from hematoxylin and eosin-stained whole slide images. Predicted expressions in 9,334 (52.75%) genes were significantly associated with RNA sequencing estimates. We also demonstrated successful prediction of an mRNA-based proliferation score with established clinical value. The results were validated in independent internal and external test datasets. Predicted spatial intratumor variabilities in expression were validated through spatial transcriptomics profiling. These results suggest that EMO provides a cost-efficient and scalable approach to predict both tumor average and intratumor spatial expression from histopathology images.

SIGNIFICANCE: Transcriptome-wide expression morphology deep learning analysis enables prediction of mRNA expression and proliferation markers from routine histopathology whole slide images in breast cancer.
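The abstract reports how many genes had predicted expression "significantly associated" with RNA sequencing estimates, but it does not name the statistical test or the multiple-testing procedure. As an illustration only, the sketch below shows one common way such a per-gene screen could be set up: a Spearman rank correlation per gene, a one-sided permutation p-value, and Benjamini-Hochberg control of the false discovery rate. These choices, the function names, the gene labels, and the synthetic toy data are all assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=2000, rng=None):
    """One-sided permutation p-value for a positive rank association."""
    rng = rng or random.Random(0)
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rx, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals, alpha=0.05):
    """Return booleans marking which hypotheses pass BH FDR control."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff = rank
    passed = [False] * m
    for i in order[:cutoff]:
        passed[i] = True
    return passed


# Synthetic toy data (NOT from the study): 30 hypothetical tumors, 2 genes.
rng = random.Random(42)
n_tumors = 30
measured = {g: [rng.gauss(0, 1) for _ in range(n_tumors)]
            for g in ("GENE_A", "GENE_B")}
predicted = {
    # GENE_A: prediction tracks the measurement plus noise.
    "GENE_A": [v + rng.gauss(0, 0.5) for v in measured["GENE_A"]],
    # GENE_B: prediction is unrelated to the measurement.
    "GENE_B": [rng.gauss(0, 1) for _ in range(n_tumors)],
}

genes = sorted(measured)
pvals = [permutation_p(predicted[g], measured[g]) for g in genes]
significant = benjamini_hochberg(pvals)
for g, p, s in zip(genes, pvals, significant):
    rho = spearman(predicted[g], measured[g])
    print(f"{g}: rho={rho:.2f}, p={p:.4f}, significant={s}")
```

With 17,695 genes tested, some form of multiple-testing correction is needed to keep the count of "significant" genes interpretable, which is why the sketch includes an FDR step; at that scale an analytic p-value would usually replace the permutation loop for speed.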

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor*
  • Breast Neoplasms / etiology
  • Breast Neoplasms / metabolism*
  • Breast Neoplasms / pathology*
  • Computational Biology / methods
  • Databases, Genetic
  • Female
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing
  • Histocytochemistry / methods
  • Humans
  • Image Processing, Computer-Assisted
  • Molecular Imaging* / methods
  • Reproducibility of Results
  • Software
  • Transcriptome