Objective: To evaluate and improve the reliability of power Doppler ultrasonography (PDUS) for detecting and scoring enthesitis in patients with spondylarthritis, using a 3-step procedure.
Methods: In the first step, we evaluated the reliability of 5 sonographers by bilaterally scanning 5 entheses twice in 5 patients. In the second step, starting from the disagreements observed during the first step, we established consensus guidelines. The sonographers' implementation of these guidelines was then evaluated in 2 reliability exercises: one on 60 PDUS enthesitis images and the other by scanning 5 new patients. In the third step, we performed a final reliability evaluation in 5 additional patients after 1 year. Kappa coefficients (kappa), variance component analysis (VCA), and generalizability theory (GT) were used to assess reliability.
Results: Initial intra- and interobserver reliability was poor, especially for detecting and scoring Doppler signal. VCA and GT showed that most of the variability was accounted for by the interaction between sonographer and enthesis. Implementation of consensus guidelines was associated with a significant improvement in Doppler reliability between the first and second steps (mean interobserver kappa increased from 0.13 to 0.51 for binary Doppler scoring in patients; P < 0.005), which persisted in the third step (mean interobserver kappa = 0.57). The high GT coefficients reached in the last steps supported this improvement.
Conclusion: The 3-step procedure used in this study to standardize PDUS technique was associated with a significant improvement in interobserver reliability for detecting enthesitis in spondylarthritis patients. Such an approach can be useful to standardize PDUS assessment of musculoskeletal disorders.
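Illustration: The mean interobserver kappa reported above can be made concrete with a minimal sketch. It assumes, and the abstract does not state this, that the mean is an unweighted average of Cohen's kappa over all pairs of sonographers for a binary Doppler score. The function names and the ratings are hypothetical and are not study data.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters scoring the same items (any categories)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(a)
    categories = set(a) | set(b)
    # Observed agreement: fraction of items given the same score by both raters.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    p_chance = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_chance == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


def mean_pairwise_kappa(ratings):
    """Unweighted mean of Cohen's kappa over all pairs of raters (assumed scheme)."""
    pairs = list(combinations(range(len(ratings)), 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(ratings[i], ratings[j]) for i, j in pairs) / len(pairs)


# Hypothetical binary Doppler scores (1 = signal present) from 3 raters on 4 entheses.
example_ratings = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]
example_mean_kappa = mean_pairwise_kappa(example_ratings)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters for Doppler signal because a finding that is rarely present inflates raw agreement.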