Comparison of rheumatoid arthritis clinical trial outcome measures: a simulation study

Arthritis Rheum. 2003 Nov;48(11):3031-8. doi: 10.1002/art.11293.


Objective: Isolated studies have suggested that continuous measures of response may be better than predefined, dichotomous definitions (e.g., the American College of Rheumatology 20% improvement criteria [ACR20]) for discriminating between rheumatoid arthritis (RA) treatments. Our goal was to determine the statistical power of predefined dichotomous outcome measures (termed "a priori"), compared with that of continuous measures derived from trial data in which there was no predefined response threshold (termed "data driven"), and to evaluate the sensitivity to change of these measures in the context of different treatments and early versus later-stage disease. In order to generalize beyond results from a single trial, we performed simulation studies.

Methods: We obtained summary data from trials comparing disease-modifying antirheumatic drugs (DMARDs) and from comparative coxib-placebo trials to test the power of 2 a priori outcomes, the ACR20 and improvement of the Disease Activity Score (ΔDAS), as well as 2 data-driven outcomes. We studied patients with early RA and those with later-stage RA (duration of <4 years and 4-9 years, respectively). We performed simulation studies, using the interrelationship of ACR core set measures in the trials to generate multiple trial data sets consistent with the original data.

Results: The data-driven outcomes had greater power than did the a priori measures. The DMARD comparison was more powerful in early disease than in later-stage disease (the sample sizes needed to achieve 80% power for the most powerful test were 64 for early disease versus 100 for later disease), but the coxib-versus-placebo comparison was less powerful in early disease than in later disease (the sample sizes needed to achieve 80% power were 200 and 100, respectively). When the effects of treatment on core set items were small and/or inconsistent, power was reduced, particularly for a less broadly based outcome (e.g., ΔDAS) compared with the ACR20.

Conclusion: The simulation studies demonstrate that data-driven outcome definitions can provide better sensitivity to change than does the ACR20 or ΔDAS. Using such methods would improve power, but at the expense of trial standardization. The studies also show how patient population and treatment characteristics affect the power of specific outcome measures in RA clinical trials, and provide quantification of those effects.
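
Illustrative sketch (not the authors' procedure): the abstract gives no simulation details beyond what is stated in the Methods, so the Python example below shows only the general idea of estimating power by simulating many trials and counting how often each outcome definition reaches significance. Every specific in it is an assumption made for illustration: a single normally distributed improvement score per patient, a hypothetical standardized treatment effect of 0.5, a hypothetical responder threshold of 0.5, 50 patients per arm, and large-sample z-tests. It does not model the ACR core set, the ACR20, the Disease Activity Score, or the study's data-driven outcomes.

```python
import math
import random
import statistics

Z_CRIT = 1.959964  # two-sided 5% critical value of the standard normal


def simulate_trial(rng, n_per_arm, effect, threshold):
    """Simulate one two-arm trial; return (continuous_sig, dichotomous_sig)."""
    # Hypothetical improvement scores: unit variance, shifted by `effect` on treatment.
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, 1.0) for _ in range(n_per_arm)]

    # Continuous outcome: large-sample z-test on the difference in means.
    se = math.sqrt(statistics.variance(control) / n_per_arm
                   + statistics.variance(treated) / n_per_arm)
    z_cont = (statistics.mean(treated) - statistics.mean(control)) / se

    # Dichotomous outcome: "responder" if improvement >= a fixed threshold,
    # compared with a pooled two-proportion z-test.
    p_c = sum(x >= threshold for x in control) / n_per_arm
    p_t = sum(x >= threshold for x in treated) / n_per_arm
    pooled = (p_c + p_t) / 2
    se_bin = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    z_bin = (p_t - p_c) / se_bin if se_bin > 0 else 0.0

    return abs(z_cont) > Z_CRIT, abs(z_bin) > Z_CRIT


def estimate_power(n_per_arm, effect, threshold=0.5, n_trials=2000, seed=0):
    """Estimate power as the fraction of simulated trials that are significant."""
    rng = random.Random(seed)
    hits_cont = hits_bin = 0
    for _ in range(n_trials):
        sig_cont, sig_bin = simulate_trial(rng, n_per_arm, effect, threshold)
        hits_cont += sig_cont
        hits_bin += sig_bin
    return hits_cont / n_trials, hits_bin / n_trials


power_continuous, power_dichotomous = estimate_power(n_per_arm=50, effect=0.5)
print(f"continuous: {power_continuous:.2f}, dichotomous: {power_dichotomous:.2f}")
```

Under these assumptions the continuous analysis typically reaches significance more often than the thresholded one, because dichotomizing discards information about the size of each patient's improvement. Rerunning `estimate_power` over a range of `n_per_arm` values is one way to locate the sample size at which a given outcome reaches 80% power.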

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Antirheumatic Agents / therapeutic use*
  • Arthritis, Rheumatoid / drug therapy*
  • Arthritis, Rheumatoid / physiopathology
  • Auranofin / therapeutic use
  • Clinical Trials as Topic*
  • Computer Simulation*
  • Cyclooxygenase Inhibitors / therapeutic use
  • Female
  • Humans
  • Lactones / therapeutic use
  • Male
  • Methotrexate / therapeutic use
  • Middle Aged
  • Models, Statistical*
  • Sensitivity and Specificity
  • Severity of Illness Index
  • Sulfones
  • Treatment Outcome

Substances


  • Antirheumatic Agents
  • Cyclooxygenase Inhibitors
  • Lactones
  • Sulfones
  • rofecoxib
  • Auranofin
  • Methotrexate