Multi-observer concordance and accuracy of the British Thoracic Society scale and other visual assessment qualitative criteria for solid pulmonary nodule assessment using FDG PET-CT

Clin Radiol. 2020 Nov;75(11):878.e21-878.e28. doi: 10.1016/j.crad.2020.06.028. Epub 2020 Jul 22.

Abstract

Aim: To compare the interobserver reliability and diagnostic accuracy of the British Thoracic Society (BTS) scale and other visual assessment criteria in the context of 2-[18F]-fluoro-2-deoxy-d-glucose (FDG) positron-emission tomography (PET)-computed tomography (CT) evaluation of solid pulmonary nodules (SPNs).

Materials and methods: Fifty patients who underwent FDG PET-CT for assessment of an SPN were identified. Seven reporters with varied experience at four centres graded FDG uptake visually using the BTS four-point scale. Five reporters also scored SPNs according to three- and five-point visual assessment scales and using semi-quantitative assessment (maximum standardised uptake value [SUVmax]). Interobserver reliability was assessed with the intra-class correlation coefficient (ICC) and weighted Cohen's kappa (κ). Diagnostic performance was evaluated by receiver operating characteristic (ROC) analysis.
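
As an illustration of the pairwise agreement statistic named above, the following is a minimal sketch of weighted Cohen's kappa for two reporters grading on an ordinal scale. The abstract does not state which weighting scheme was used, so linear weights are an assumption here; the function name and the example grades are hypothetical and are not study data.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    Assumption: disagreement weights are linear (|i - j|) or quadratic
    ((i - j)^2) in the distance between grades, normalised to [0, 1].
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (grade by A, grade by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted observed disagreement vs. disagreement expected by chance.
    obs_dis = sum(disagreement(i, j) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(disagreement(i, j) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    return 1.0 - obs_dis / exp_dis


# Hypothetical grades from two reporters on a four-point scale (1-4).
reporter_1 = [1, 2, 2, 3, 4, 4, 1, 3]
reporter_2 = [1, 2, 3, 3, 4, 3, 1, 2]
kappa = weighted_kappa(reporter_1, reporter_2, categories=[1, 2, 3, 4])
```

With ordinal grades, weighting penalises a one-grade disagreement less than a three-grade disagreement, which is why a weighted rather than unweighted kappa suits scales of this kind.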

Results: Good interobserver reliability was demonstrated with the BTS scale (ICC=0.78, 95% confidence interval [CI]: 0.69-0.85) and five-point scale (ICC=0.78, 95% CI: 0.68-0.86), whilst the three-point scale demonstrated moderate reliability (ICC=0.70, 95% CI: 0.59-0.80). Almost perfect agreement was achieved between two consultants (κ=0.85), and substantial agreement between two other consultants (κ=0.78) using the BTS scale. ROC curves for the BTS and five-point scales demonstrated equivalent accuracy (BTS area under the ROC curve [AUC]=0.768; five-point AUC=0.768). SUVmax was not significantly more accurate than the BTS scale (SUVmax AUC=0.794; BTS AUC=0.768, p=0.43).
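
For readers unfamiliar with how an AUC is obtained from an ordinal visual scale, the sketch below computes it as the probability that a randomly chosen malignant nodule receives a higher grade than a randomly chosen benign one, counting ties as one half (the Mann-Whitney formulation). This is only one standard way to compute AUC and is not confirmed as the authors' method; the function name, grades and labels are hypothetical, not study data.

```python
def auc_from_grades(grades, labels):
    """Area under the ROC curve via pairwise comparison.

    grades: ordinal uptake grades or SUVmax values, one per nodule.
    labels: 1 for malignant, 0 for benign (hypothetical reference standard).
    """
    positives = [g for g, y in zip(grades, labels) if y == 1]
    negatives = [g for g, y in zip(grades, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one malignant and one benign case")

    score = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                score += 1.0   # malignant nodule graded higher
            elif p == q:
                score += 0.5   # tie, common on a coarse ordinal scale
    return score / (len(positives) * len(negatives))


# Hypothetical four-point grades and outcomes for eight nodules.
example_grades = [1, 2, 2, 3, 3, 4, 4, 1]
example_labels = [0, 0, 1, 0, 1, 1, 1, 0]
example_auc = auc_from_grades(example_grades, example_labels)
```

Because a four- or five-point scale produces many tied grades, the half-credit for ties matters more here than it would for a continuous measure such as SUVmax.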

Conclusions: The BTS scale can be applied reliably by reporters with varied levels of PET-CT reporting experience across different centres, and has a diagnostic performance that is not surpassed by alternative scales.

MeSH terms

  • Aged
  • Female
  • Fluorodeoxyglucose F18
  • Humans
  • Lung / diagnostic imaging
  • Lung Neoplasms / diagnosis
  • Lung Neoplasms / diagnostic imaging*
  • Male
  • Middle Aged
  • Observer Variation
  • Positron Emission Tomography Computed Tomography* / methods
  • Positron Emission Tomography Computed Tomography* / standards
  • Positron Emission Tomography Computed Tomography* / statistics & numerical data
  • Reproducibility of Results
  • Solitary Pulmonary Nodule / diagnosis
  • Solitary Pulmonary Nodule / diagnostic imaging*

Substances

  • Fluorodeoxyglucose F18