Predictive Validity of the Beers and Screening Tool of Older Persons' Potentially Inappropriate Prescriptions (STOPP) Criteria to Detect Adverse Drug Events, Hospitalizations, and Emergency Department Visits in the United States

J Am Geriatr Soc. 2016 Jan;64(1):22-30. doi: 10.1111/jgs.13884.


Objectives: To compare the predictive validity of the 2003 Beers, 2012 American Geriatrics Society (AGS) Beers, and Screening Tool of Older Persons' Potentially Inappropriate Prescriptions (STOPP) criteria.

Design: Retrospective cohort.

Setting: Managed care administrative claims data from 2006 to 2009.

Participants: Commercially insured persons aged 65 and older in the United States (N=174,275).

Measurements: Associations of inappropriate medication use with adverse drug events (ADEs), emergency department (ED) visits, and hospitalizations, estimated using time-varying Cox proportional hazards models. Hazard ratios (HRs) and a measure of model discrimination (C-index) were calculated for both unadjusted and adjusted models.
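As an illustration of what the discrimination measure captures, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the study's analysis code: the function name `harrell_c_index`, the toy follow-up data, and the use of a fixed binary exposure flag as the risk score (rather than a time-varying Cox model) are all assumptions made for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, scores):
    """Harrell's C for right-censored data (illustrative, hypothetical helper).

    times  : follow-up time per subject
    events : 1 if the outcome occurred at that time, 0 if censored
    scores : risk score per subject (higher = higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the ordering is unknown
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0  # higher score failed first: concordant pair
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented, not from the study): score is 1 if the subject was
# flagged by a prescribing criterion, 0 otherwise.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_flagged = [1, 0, 1, 0]
toy_c = harrell_c_index(toy_times, toy_events, toy_flagged)
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the reported C-indices should be read.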

Results: The prevalence of inappropriate prescribing was 34.1% for the 2012 AGS Beers criteria, 32.2% for the 2003 Beers criteria, and 27.6% for the STOPP criteria. Each set of criteria modestly discriminated ADEs in unadjusted analyses (STOPP criteria: HR=2.89, 95% confidence interval (CI)=2.68-3.12, C-index=0.607; 2012 AGS Beers criteria: HR=2.51, 95% CI=2.33-2.70, C-index=0.603; 2003 Beers criteria: HR=2.65, 95% CI=2.46-2.85, C-index=0.605). Similar results were observed for ED visits and hospitalizations. The C-indices increased to between 0.65 and 0.70 in adjusted analyses. The kappa for agreement between criteria was 0.80 for the 2003 and 2012 AGS Beers criteria, 0.58 for the 2012 AGS Beers and STOPP criteria, and 0.59 for the 2003 Beers and STOPP criteria. Across the three outcomes, the 2012 AGS Beers criteria had the highest sensitivity (61.2-71.2%) and the lowest specificity (41.2-70.7%), and the STOPP criteria had the lowest sensitivity (53.8-64.7%) but the highest specificity (47.8-78.1%).
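For readers less familiar with the agreement and accuracy measures reported above, the sketch below shows how sensitivity, specificity, and Cohen's kappa follow from 2x2 counts. The function names and all counts are hypothetical, chosen only to make the arithmetic easy to check; they are not the study's data.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 counts.

    tp: flagged by the criteria and had the outcome
    fp: flagged but no outcome
    fn: not flagged but had the outcome
    tn: not flagged and no outcome
    """
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(both, only_a, only_b, neither):
    """Cohen's kappa for agreement between two binary classifications."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n            # observed agreement
    a_yes = (both + only_a) / n                # share flagged by criteria A
    b_yes = (both + only_b) / n                # share flagged by criteria B
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=60, fp=30, fn=40, tn=70)
kappa = round(cohen_kappa(both=40, only_a=10, only_b=10, neither=40), 3)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two criteria can flag many of the same patients and still show only moderate kappa values.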

Conclusion: All three criteria were modestly prognostic for ADEs, ED visits, and hospitalizations, with the STOPP criteria slightly outperforming both Beers criteria. Given their low sensitivity, low specificity, and low agreement with one another, the criteria can be used in a complementary fashion to enhance sensitivity in detecting ADEs.

Keywords: Beers criteria; STOPP criteria; adverse drug events; inappropriate prescribing.

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Aged
  • Aged, 80 and over
  • Drug-Related Side Effects and Adverse Reactions / diagnosis*
  • Drug-Related Side Effects and Adverse Reactions / epidemiology
  • Emergency Service, Hospital*
  • Female
  • Follow-Up Studies
  • Geriatrics
  • Hospitalization / trends*
  • Humans
  • Inappropriate Prescribing / prevention & control*
  • Male
  • Potentially Inappropriate Medication List / statistics & numerical data*
  • Prevalence
  • Retrospective Studies
  • Societies, Medical
  • Time Factors
  • United States / epidemiology