Objective: To compare the performance of the MOS SF12 health survey (SF12) with the SF36 in a sample of 233 patients with rheumatoid arthritis (RA) stratified by functional class.
Methods: The SF12 and SF36 physical and mental component summary scales (PCS and MCS) were compared for test-retest reliability [intra-class correlation coefficient (RC) and repeatability], construct validity, and responsiveness [standardized response mean (SRM)] to self-reported change in health.
Results: Overall, despite its brevity, the SF12 is comparable to the SF36, with only some loss of performance. The SF12-PCS is slightly less reliable (RC = 0.75) and less responsive to improvements in health (SRM = 0.52) than the SF36-PCS (RC = 0.81; SRM = 0.61). The SF12-PCS correlates strongly with the SF36-PCS (R = 0.94), the SF36 physical function subscale (R = 0.77) and the modified Stanford Health Assessment Questionnaire (MHAQ) (R = 0.71), but only weakly with the SF36 mental health subscale (R = 0.22). The SF12-PCS discriminated well between Steinbrocker functional classes; patients in functional classes 1-4 had SF12-PCS scores 1σ, 2σ, 2.4σ and 2.7σ below the population norm, respectively (ANOVA, F = 35.8, P < 0.001). The SF12-MCS is relatively unresponsive to reported improvement in RA (SRM = 0.31), but is reliable (RC = 0.71) and correlates well with the SF36-MCS (R = 0.71). The SF12-MCS correlates more closely than the SF36-MCS with the SF36 mental health subscale (R = 0.86) and the Hospital Anxiety and Depression (HAD) scale (R = 0.76). In ANOVA models, only the HAD score (R² = 57%) contributes significantly to variance in the SF12-MCS (F = 254.8; P < 0.001), whereas both the HAD (R² = 24%) and MHAQ (R² = 10%) scores contribute to variance in the SF36-MCS (F = 50.9; P < 0.001). Thus, the SF12-MCS has better construct validity for mental health than the SF36-MCS in RA subjects. The rate of missing item responses was high among patients in functional class 4 (34%).
Conclusion: The SF12 is a reliable, valid and responsive measure of health status in the majority of RA patients, and meets standards required for comparing groups of patients. Its application in the most severely disabled subjects is uncertain.
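Note on the responsiveness statistic: the SRM used above is conventionally defined as the mean change in score between two assessments divided by the standard deviation of those changes. A minimal sketch of that calculation follows, assuming paired baseline and follow-up scores per patient. The function name and the example scores are hypothetical illustrations and are not taken from the study data.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Return the SRM: mean of paired changes / sample SD of paired changes."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must have the same length")
    # Change score for each patient (follow-up minus baseline).
    changes = [after - before for before, after in zip(baseline, follow_up)]
    if len(changes) < 2:
        raise ValueError("at least two paired observations are required")
    sd = stdev(changes)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("SRM is undefined when all changes are identical")
    return mean(changes) / sd


# Hypothetical component summary scores for four patients (not study data).
baseline_scores = [30, 35, 40, 28]
follow_up_scores = [34, 36, 46, 29]
srm = standardized_response_mean(baseline_scores, follow_up_scores)
```

Because the denominator is the spread of the change scores rather than of the baseline scores, an instrument whose changes are small but consistent can still yield a high SRM.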