Aims: To examine the level of agreement among observers regarding changes between serial images of bone metastases.
Methods: Thirty-five pairs of bone X-rays and 30 pairs of bone scans were selected from the files of patients with breast cancer involving the skeleton. Both images in each pair showed the same site and had been taken at least 12 weeks apart. Thirteen radiologists examined the X-ray pairs and 14 nuclear medicine physicians examined the bone scan pairs. Each observer assessed whether the changes between sequential images represented improvement, stability or worsening. Inter-observer agreement was analysed using the kappa statistic (κ).
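As an illustration of how agreement among more than two observers can be summarised in a single kappa value, the following is a minimal sketch of Fleiss' kappa. It is an assumption that this multi-rater variant corresponds to the statistic used here; the abstract states only "the kappa statistic". The function name `fleiss_kappa` and the toy counts are hypothetical and are not study data.

```python
import math
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    observers who assigned image pair i to category j.

    Assumes every image pair was rated by the same number of observers.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two observers")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of observer pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items      # mean observed agreement
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance

    if math.isclose(p_expected, 1.0):
        # All ratings fell in one category; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 3 image pairs, 3 observers, and the categories
# (improvement, stability, worsening). Not taken from the study.
toy_counts = [
    [3, 0, 0],
    [0, 2, 1],
    [0, 1, 2],
]
toy_kappa = fleiss_kappa(toy_counts)
```

Kappa rescales observed agreement by the agreement expected from chance alone, so a value of 0 means agreement no better than chance and 1 means complete agreement.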
Results: There was only fair overall agreement among radiologists regarding changes between X-rays (κ = 0.23), but there was substantial agreement among nuclear medicine physicians for bone scan assessments (κ = 0.62). Neither the experience of the observers nor the time between images had a significant effect on agreement. For X-rays, agreement was poorer when the response category was 'improvement' and when the type of bone lesion was mixed lytic/sclerotic.
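The labels "fair" and "substantial" are consistent with the widely used Landis and Koch bands for interpreting kappa. Whether the authors applied exactly this scale is an assumption; the sketch below only shows how the two reported values map onto those bands. The function name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the descriptive bands proposed by
    Landis and Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# The two kappa values reported in the Results.
xray_label = landis_koch_label(0.23)       # "fair"
bone_scan_label = landis_koch_label(0.62)  # "substantial"
```

These bands are conventions rather than statistical thresholds, so they describe the reported values without adding evidence beyond them.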
Conclusions: Evaluation of serial X-rays is unreliable for determining the response of bone metastases. Scintigraphic evaluation has a higher internal validity for the determination of response, but it should not be used in isolation from other clinical data.