Metastasis detection from whole slide images using local features and random forests

Cytometry A. 2017 Jun;91(6):555-565. doi: 10.1002/cyto.a.23089. Epub 2017 Apr 20.


Digital pathology has led to a demand for automated detection of regions of interest, such as cancerous tissue, from scanned whole slide images. With accurate methods using image analysis and machine learning, a significant speed-up and cost savings through increased throughput in histological assessment could be achieved. This article describes a machine learning approach for detection of cancerous tissue from scanned whole slide images. Our method is based on feature engineering and supervised learning with a random forest model. The features extracted from the whole slide images include several local descriptors related to image texture, spatial structure, and distribution of nuclei. The method was evaluated in breast cancer metastasis detection from lymph node samples. Our results show that the method detects metastatic areas with high accuracy (AUC = 0.97-0.98 for tumor detection within the whole image area, AUC = 0.84-0.91 for tumor vs. normal tissue detection) and that it generalizes well to images from more than one laboratory. Further, the method outputs an interpretable classification model, enabling the linking of individual features to differences between tissue types. © 2017 International Society for Advancement of Cytometry.
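
The accuracy figures in the abstract are given as AUC, the area under the ROC curve. As a minimal sketch of what that number measures, the snippet below computes AUC as the fraction of (tumor, normal) pairs in which the tumor region receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from the study, and the pairwise formulation is only one of several equivalent ways to compute the metric.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for tumor, 0 for normal (hypothetical per-region ground truth).
    scores: classifier output per region; higher means more tumor-like.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # tumor region correctly ranked above normal
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical scores for seven regions (3 tumor, 4 normal).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 11 of 12 pairs are ranked correctly -> 0.917
```

An AUC of 1.0 would mean every tumor region outscores every normal region, while 0.5 corresponds to a ranking no better than chance.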

Keywords: breast cancer; computer aided diagnosis; digital pathology; machine learning; metastasis detection; random forest; sentinel lymph nodes; whole slide images.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Area Under Curve
  • Breast Neoplasms / diagnosis*
  • Breast Neoplasms / pathology
  • Cell Nucleus / pathology
  • Cell Nucleus / ultrastructure
  • Eosine Yellowish-(YS)
  • Female
  • Hematoxylin
  • Histocytochemistry / statistics & numerical data*
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Lymph Nodes / diagnostic imaging*
  • Lymph Nodes / pathology
  • Lymphatic Metastasis
  • Lymphocytes / pathology
  • Lymphocytes / ultrastructure
  • Machine Learning*
  • Middle Aged
  • ROC Curve
  • Software


Substances

  • Eosine Yellowish-(YS)
  • Hematoxylin