Objective: To test the interobserver reliability, truth, discrimination and feasibility of two radiographic scoring methods for ankylosing spondylitis (AS) over a follow-up period of 3 years.
Methods: Two blinded, trained observers scored 95 radiographs from a cohort of AS patients. Each radiograph was scored with two methods: the modified Stoke Ankylosing Spondylitis Spine Score (mSASSS) and the Bath Ankylosing Spondylitis Radiology Index-spine (BASRI-spine). Intra- and interobserver agreement were analyzed by intraclass correlation coefficients (ICC). Construct validity was assessed by examining the correlation of each scoring method with measures of spinal mobility (Bath Ankylosing Spondylitis Metrology Index, BASMI), functional limitation (Bath Ankylosing Spondylitis Functional Index, BASFI) and disease duration. Bland and Altman's 95% limits of agreement method was used to estimate the smallest detectable difference (SDD) of radiological progression, and effect size (ES) analysis was used to assess responsiveness.
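The agreement and responsiveness statistics named in the Methods can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so this sketch assumes the SDD is the half-width of the Bland-Altman 95% limits of agreement (1.96 times the standard deviation of the between-observer differences) and that the ES is the mean change divided by the standard deviation of the baseline scores. The function names and the sample scores are hypothetical and are not data from the study.

```python
import statistics


def limits_of_agreement(obs_a, obs_b):
    """Bland-Altman 95% limits of agreement for paired scores.

    Returns (lower, upper, half_width). The half-width is used here as an
    assumed definition of the smallest detectable difference (SDD).
    """
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    half_width = 1.96 * sd_diff
    return mean_diff - half_width, mean_diff + half_width, half_width


def effect_size(baseline, follow_up):
    """Assumed ES definition: mean change divided by the SD of baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(baseline)


# Hypothetical scores from two observers for the same four radiographs.
observer_1 = [2, 4, 6, 8]
observer_2 = [1, 5, 5, 9]
lower, upper, sdd = limits_of_agreement(observer_1, observer_2)

# Hypothetical baseline and 3-year scores for three patients.
es = effect_size([0, 2, 4], [1, 3, 5])
print(f"limits of agreement: {lower:.2f} to {upper:.2f}, SDD: {sdd:.2f}, ES: {es:.2f}")
```

Under these assumptions, a patient would be counted as having progressed when the change in score exceeds the SDD, which is one way to arrive at a percentage of patients classified as changed.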
Results: The BASRI-spine reached intra- and interobserver ICC of 0.755 and 0.831, respectively. The mSASSS was more reliable, with ICC of 0.874 and 0.941, respectively. Both scoring systems correlated significantly with BASMI (p = 0.01), while only the mSASSS showed a significant correlation with BASFI (p = 0.02). With regard to sensitivity to change, the mSASSS classified a higher percentage of patients as having changed than the BASRI-spine did (mSASSS: 35.8% vs. BASRI-spine: 15.8%). The ES analysis also suggested that the mSASSS was more responsive than the BASRI-spine. Concerning feasibility, the BASRI-spine took less time to score.
Conclusion: The mSASSS offers advantages in measurement properties and is the more appropriate of the two methods for assessing progression of structural damage in AS.