Purpose: This nationwide study investigated inter-observer variation and performance in the interpretation of bone scans with regard to the presence or absence of bone metastases.
Methods: Bone scan images from 59 patients with breast or prostate cancer, who had undergone scintigraphy because of suspected bone metastatic disease, were studied. The patients were selected to reflect the spectrum of pathology found in everyday clinical work. Whole-body images, anterior and posterior views, were sent to all 30 hospitals in Sweden that perform bone scans. Thirty-seven observers from 18 hospitals agreed to participate in the study. They were asked to classify each patient study regarding the presence of bone metastases, using a four-point scale. Each observer's classifications were compared pairwise with those of every other observer, resulting in 666 pairwise comparisons (37 × 36 / 2). The interpretations of the 37 observers were also compared with the final clinical assessment, which was based on follow-up scans and other clinical data.
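The pairwise comparison described above can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa; the abstract does not state which kappa variant was used. The function names `cohen_kappa` and `pairwise_kappas` and the data layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    ASSUMPTION: unweighted kappa; the study may have used a different variant.
    """
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of cases on which the two observers agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1.0 - expected)


def pairwise_kappas(ratings):
    """Map each unordered observer pair to its kappa.

    `ratings` is a hypothetical dict: observer id -> list of classifications,
    one per patient, in the same patient order for every observer.
    """
    return {
        (i, j): cohen_kappa(ratings[i], ratings[j])
        for i, j in combinations(sorted(ratings), 2)
    }


# Number of unordered pairs among 37 observers: 37 * 36 / 2 = 666.
n_pairs = len(list(combinations(range(37), 2)))
```

With 37 observers, `pairwise_kappas` would return 666 values, whose minimum, maximum, and mean could then be summarized.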
Results: On average, two observers agreed on 64% of the bone scan classifications. Kappa values ranged from 0.16 to 0.82, with a mean of 0.48. Sensitivity and specificity of the observers, compared with the final clinical assessment, were 77% and 96%, respectively, for detecting bone metastases on planar whole-body bone scans.
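As an illustration of how sensitivity and specificity are computed for one observer against a reference assessment, consider the sketch below. The threshold used to turn the four-point scale into a binary call is an assumption (the abstract does not say how grades were dichotomized), the helper names are hypothetical, and the toy data are invented purely for the example.

```python
def to_binary(grades, threshold=3):
    """Convert four-point grades to binary calls.

    ASSUMPTION: grades run from 1 to 4 and a grade >= threshold counts as
    "metastases present". The study's actual cut-off is not given here.
    """
    return [g >= threshold for g in grades]


def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for binary calls against a reference."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have equal length")
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # true positives
    fn = sum(1 for p, t in pairs if not p and t)      # false negatives (missed)
    tn = sum(1 for p, t in pairs if not p and not t)  # true negatives
    fp = sum(1 for p, t in pairs if p and not t)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data for five patients (not from the study).
grades = [4, 3, 2, 1, 1]
truth = [True, True, True, False, False]
sens, spec = sensitivity_specificity(to_binary(grades), truth)
```

In this toy case one of three true positives is missed while no negative case is called positive, the same pattern of false-negative errors alongside high specificity.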
Conclusion: Moderate inter-observer agreement was found when observers were compared pairwise. False-negative errors appear to be the main problem in the interpretation of bone scan images, whereas observer specificity was high.