Automated classification of benign and malignant lesions in 18F-NaF PET/CT images using machine learning

Phys Med Biol. 2018 Nov 20;63(22):225019. doi: 10.1088/1361-6560/aaebd0.


Purpose: 18F-NaF PET/CT imaging of bone metastases is confounded by tracer uptake in benign diseases, such as osteoarthritis. The goal of this work was to develop an automated bone lesion classification algorithm to classify lesions in NaF PET/CT images.

Methods: A nuclear medicine physician manually identified and classified 1751 bone lesions in NaF PET/CT images from 37 subjects with metastatic castrate-resistant prostate cancer; the images of 14 of these subjects (598 lesions) were also analyzed by three additional physicians. Lesions were classified on a five-point scale from definitely benign to definitely metastatic. Classification agreement between physicians was assessed using Fleiss' κ. To perform fully automated lesion classification, three different lesion detection methods based on thresholding were assessed: SUV > 10 g/ml, SUV > 15 g/ml, and a statistically optimized regional thresholding (SORT) algorithm. For each ROI in the image, 172 different imaging features were extracted, including PET, CT, and spatial probability features. These imaging features were used as inputs into different machine learning algorithms. The impact of different deterministic factors affecting classification performance was assessed.
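The agreement statistic named above, Fleiss' κ, can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `fleiss_kappa` and the toy rating table (six lesions, four raters, five categories) are assumptions made up for illustration, and only the textbook definition of the statistic is implemented.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every row must sum to the same number of raters (>= 2).
    Undefined (division by zero) if all ratings fall in one category.
    """
    if not counts:
        raise ValueError("need at least one rated subject")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject: fraction of rater pairs that agree.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: 6 lesions, 4 raters, 5-point scale (columns).
toy_ratings = [
    [4, 0, 0, 0, 0],
    [3, 1, 0, 0, 0],
    [0, 2, 2, 0, 0],
    [0, 0, 1, 3, 0],
    [0, 0, 0, 1, 3],
    [0, 0, 0, 0, 4],
]
print(f"Fleiss' kappa on toy data: {fleiss_kappa(toy_ratings):.3f}")
```

A value of 1 indicates perfect agreement and values near 0 indicate agreement no better than chance.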

Results: The factors that most impacted classification performance were the machine learning algorithm and the lesion identification method. Random forests (RF) had the highest classification performance. For lesion segmentation, using SORT (AUC = 0.95 [95% CI = 0.94-0.95], sensitivity = 88% [86%-90%], and specificity = 89% [87%-90%]) resulted in superior classification performance (p < 0.001) compared to SUV > 10 g/ml (AUC = 0.87) and SUV > 15 g/ml (AUC = 0.86). While there was only moderate agreement between physicians in lesion classification (κ = 0.53 [95% CI = 0.52-0.53]), classification performance was high using any of the four physicians as ground truth (AUC range: 0.91-0.93).
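The performance measures reported above (AUC, sensitivity, specificity) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the evaluation code used in the study: the function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical, and the confidence intervals quoted in the abstract are not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold means 'malignant'."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = metastatic, 0 = benign; scores are classifier outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC = {auc:.2f}, sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

AUC does not depend on any single threshold, whereas sensitivity and specificity describe one operating point on the ROC curve.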

Conclusion: We have developed the first whole-body automatic disease classification tool for NaF PET using RF, and demonstrated its ability to replicate different physicians' classification tendencies. This enables fully automated analysis of whole-body NaF PET/CT images.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Automation
  • Bone Neoplasms / diagnostic imaging*
  • Bone Neoplasms / secondary
  • Fluorine Radioisotopes*
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Machine Learning*
  • Male
  • Positron Emission Tomography Computed Tomography*
  • Prostatic Neoplasms, Castration-Resistant / pathology
  • Sensitivity and Specificity
  • Sodium Fluoride*


Substances

  • Fluorine Radioisotopes
  • Sodium Fluoride
  • Fluorine-18