ComBat harmonization for radiomic features in independent phantom and lung cancer patient computed tomography datasets

Phys Med Biol. 2020 Jan 13;65(1):015010. doi: 10.1088/1361-6560/ab6177.

Abstract

This work evaluates the ability of the combatting batch effect (ComBat) harmonization algorithm to reduce the variation in radiomic features arising from different imaging protocols, and independently verifies published results. The Gammex computed tomography (CT) electron density phantom and Quasar body phantom were imaged using 32 different chest imaging protocols. A total of 107 radiomic features were extracted from 15 spatially varying spherical contours between 1.5 cm and 3 cm in each of the lung300 density, lung450 density, and wood inserts. The Kolmogorov-Smirnov test was used to determine significant differences in the distributions of the features, and the concordance correlation coefficient (CCC) was used to measure the repeatability of the features from each protocol variation class (kVp, pitch, etc.) before and after ComBat harmonization. P-values were corrected for multiple comparisons using the Benjamini-Hochberg-Yekutieli procedure. Finally, the ComBat algorithm was applied to human subject data from 135 patients imaged using six different thorax imaging protocols. Spherical contours of un-irradiated lung (2 cm) and vertebral bone (1 cm) were used for radiomic feature extraction. ComBat harmonization reduced the percentage of features with significantly different distributions to 0%-2%, or preserved it at 0%, across all protocol variations for the lung300, lung450, and wood inserts. For the human subject data, ComBat harmonization reduced the percentage of significantly different features from 0%-59% for bone and 0%-19% for lung to 0% for both. This work verifies previously published results and demonstrates that ComBat harmonization is an effective means of harmonizing radiomic features extracted from different imaging protocols, allowing comparisons in large multi-institution datasets. Biological variation can be explicitly preserved by providing the ComBat algorithm with clinical or biological variables to protect. The effect of ComBat harmonization on predictive models should be tested.
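
The statistical checks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The function names (`concordance_cc`, `by_adjust`, `fraction_different`) are hypothetical, and the two-sample form of the Kolmogorov-Smirnov test and the 0.05 significance threshold are assumptions, since the abstract specifies neither. The CCC follows Lin's standard definition, and the p-value adjustment follows the standard Benjamini-Yekutieli step-up formula.

```python
import numpy as np
from scipy.stats import ks_2samp


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    # Population (1/n) variances and covariance, as in Lin's definition.
    cov = np.mean((x - mx) * (y - my))
    return 2.0 * cov / (x.var() + y.var() + (mx - my) ** 2)


def by_adjust(pvals):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranks = np.arange(1, m + 1)
    c_m = np.sum(1.0 / ranks)  # harmonic factor for arbitrary dependence
    scaled = p[order] * m * c_m / ranks
    # Enforce monotonicity starting from the largest p-value, then cap at 1.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out


def fraction_different(features_a, features_b, alpha=0.05):
    """Fraction of feature columns whose distributions differ between two
    protocols (rows = contours, columns = features). alpha is an assumption."""
    pvals = [
        ks_2samp(features_a[:, j], features_b[:, j]).pvalue
        for j in range(features_a.shape[1])
    ]
    return float(np.mean(by_adjust(pvals) < alpha))
```

Under these assumptions, calling `fraction_different` on the same pair of protocol feature matrices before and after harmonization would give the kind of "percentage of significantly different features" comparison the abstract reports, and `concordance_cc` would be applied per feature across protocol pairs.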

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Carcinoma, Non-Small-Cell Lung / diagnostic imaging
  • Carcinoma, Non-Small-Cell Lung / pathology*
  • Carcinoma, Non-Small-Cell Lung / surgery
  • Datasets as Topic
  • Female
  • Humans
  • Image Processing, Computer-Assisted / standards*
  • Longitudinal Studies
  • Lung Neoplasms / diagnostic imaging
  • Lung Neoplasms / secondary*
  • Lung Neoplasms / surgery
  • Male
  • Middle Aged
  • Phantoms, Imaging*
  • Radiosurgery / methods
  • Retrospective Studies
  • Tomography Scanners, X-Ray Computed / standards*
  • Tomography, X-Ray Computed / methods*