The SF-36 Health Survey as a generic outcome measure in clinical trials of patients with osteoarthritis and rheumatoid arthritis: tests of data quality, scaling assumptions and score reliability

Med Care. 1999 May;37(5 Suppl):MS10-22. doi: 10.1097/00005650-199905001-00002.


Objective: To evaluate the psychometric assumptions underlying the construction and scoring of SF-36 scales and summary measures among clinical trial participants with arthritis.

Methods: Cross-sectional SF-36 data from the baseline assessment of adult patients (n = 1,016) participating in four placebo-controlled clinical trials of treatment for arthritis were analyzed blind to treatment assignment. Tests of data completeness, scaling assumptions, internal-consistency reliability, and factor structure of SF-36 scales were performed for the combined sample. Eligible participants had at least a 6-month history of moderate to severe osteoarthritis or rheumatoid arthritis of the knee or hip. Participants meeting the inclusion criteria underwent a washout period of 3-14 days before baseline assessment to bring about a flare state in osteoarthritis or rheumatoid arthritis symptoms. Baseline sample sizes were n = 121, n = 341, and n = 187 for the three osteoarthritis trials and n = 367 for the rheumatoid arthritis trial. The average age of participants was 60 years, and most were female (72%). The measures were the functional health and well-being scales and the physical and mental health summary measures from the SF-36 Health Survey acute form.
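As an illustration of the kind of scaling-assumption test named above, the following is a minimal sketch of an item internal consistency check: each item is correlated with the sum of the other items in its hypothesized scale (a correlation corrected for overlap), and the share of items reaching a threshold is reported. The abstract does not give the criterion used; the 0.40 threshold, the function names, and the toy data are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_r(items, index):
    """Correlate one item with the sum of the OTHER items in its scale.

    items: list of item columns, each a list with one response per respondent.
    """
    item = items[index]
    rest = [sum(row) - row[index] for row in zip(*items)]
    return pearson(item, rest)


def item_consistency_pass_rate(items, threshold=0.40):
    """Percentage of items whose corrected item-scale correlation meets
    the threshold (0.40 is an assumed criterion, not taken from the abstract)."""
    rs = [corrected_item_scale_r(items, i) for i in range(len(items))]
    passed = sum(1 for r in rs if r >= threshold)
    return 100.0 * passed / len(rs)


# Hypothetical three-item scale answered by five respondents.
toy_scale = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
pass_rate = item_consistency_pass_rate(toy_scale)
```

An item discriminant validity test would extend the same idea by comparing each item's correlation with its own scale against its correlations with the other scales.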

Results: Missing responses ranged from 0.0% to 1.5% across SF-36 items, and scale scores could be computed for 96.8% to 100% of participants across trials. In all four trials, item internal consistency tests were passed (91.4%-97.1%) and item discriminant validity tests were passed (96.9%-100.0%). Across the four trials, internal-consistency reliability coefficients ranged from a low of 0.75 to a high of 0.91 for the eight scales (median = 0.84), exceeding the minimum standards for group comparisons. Ceiling effects were minimal for most scales, and floor effects were noteworthy for the role physical and role emotional scales. Physical and mental health factors identified in previous studies were replicated.
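For readers unfamiliar with the statistics reported above, here is a minimal sketch of how an internal-consistency reliability coefficient (Cronbach's alpha) and floor and ceiling percentages could be computed for a single scale. It assumes complete item responses and scale scores on the 0 to 100 metric used for SF-36 scales; the function names and toy data are hypothetical and do not reproduce the study's analysis.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of item columns, each a list with one response per respondent.
    Assumes no missing responses and at least two items.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items must have the same number of responses")
    totals = [sum(row) for row in zip(*items)]       # per-respondent sum score
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def floor_ceiling(scores, lowest=0.0, highest=100.0):
    """Percentage of respondents at the lowest and highest possible score.

    Scores of None (scale could not be computed) are ignored.
    """
    valid = [s for s in scores if s is not None]
    floor_pct = 100.0 * sum(1 for s in valid if s == lowest) / len(valid)
    ceiling_pct = 100.0 * sum(1 for s in valid if s == highest) / len(valid)
    return floor_pct, ceiling_pct


# Hypothetical data: three perfectly consistent items, four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4]])
floor_pct, ceiling_pct = floor_ceiling([0, 0, 50, 100, None])
```

A coefficient of about 0.70 is often cited as the minimum for group-level comparisons; the abstract itself does not state which standard was applied.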

Conclusion: The SF-36 Health Survey proved to be a psychometrically sound tool for the assessment of the health status of adult participants in clinical trials of arthritis.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Arthritis, Rheumatoid / drug therapy*
  • Clinical Trials as Topic / standards*
  • Clinical Trials as Topic / statistics & numerical data
  • Cross-Sectional Studies
  • Female
  • Health Status Indicators*
  • Humans
  • Male
  • Middle Aged
  • Osteoarthritis / drug therapy*
  • Outcome Assessment, Health Care / standards*
  • Outcome Assessment, Health Care / statistics & numerical data
  • Psychometrics
  • Reproducibility of Results
  • Surveys and Questionnaires / standards*
  • Time Factors