The reliability of AHRQ Common Format Harm Scales in rating patient safety events

J Patient Saf. 2015 Mar;11(1):52-9. doi: 10.1097/PTS.0b013e3182948ef9.


Objectives: This study assessed the reliability of the Agency for Healthcare Research and Quality (AHRQ) Common Format Harm Scale versions 1.1 and 1.2 in rating patient safety events among users of the UHC Patient Safety Net, a Web-based incident reporting tool.

Methods: To test interrater agreement, UHC developed a survey tool consisting of patient event scenarios. In 2011, a survey evaluating Harm Scale v.1.1 was distributed to 921 quality, risk, and safety (QRS) managers at 89 organizations; in 2012, a second survey evaluating Harm Scale v.1.2 was sent to 13,280 managers at 102 organizations.

Results: With either version, fewer than 60% of respondents agreed on a single score in 3 of 9 scenarios. Interrater agreement increased for certain event scenarios with v.1.2 but decreased for other scenarios. Interrater reliability was moderate for both v.1.1 (κ = 0.51) and v.1.2 (κ = 0.47). Interrater agreement improved in v.1.2 when results were limited to more experienced raters but still remained in the moderate range (κ = 0.58).
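
As an illustration of the two measures reported above, the following minimal sketch computes the share of raters who chose the most common score for a scenario and a multi-rater kappa. The abstract does not state which kappa statistic was used; Fleiss' kappa is assumed here as one common choice when many raters score the same items. The function names and the toy counts are hypothetical and are not the study's data.

```python
from collections import Counter


def modal_agreement(ratings):
    """Fraction of raters who chose the most common score for one scenario."""
    counts = Counter(ratings)
    return max(counts.values()) / len(ratings)


def fleiss_kappa(table):
    """Fleiss' kappa for a count table.

    table[i][j] is the number of raters who assigned scenario i to
    harm category j. Every scenario must have the same number of raters.
    """
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("each scenario needs the same number of raters")
    n_cats = len(table[0])

    # Overall proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement within each scenario.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: 3 scenarios, 10 raters, 4 harm categories.
toy_table = [
    [7, 2, 1, 0],
    [1, 5, 4, 0],
    [0, 1, 3, 6],
]
print("kappa:", round(fleiss_kappa(toy_table), 2))
print("modal agreement:", modal_agreement(["mild"] * 5 + ["moderate"] * 4 + ["severe"]))
```

In this sketch, a scenario in which raters split between adjacent harm levels lowers both the modal agreement for that scenario and the overall kappa.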

Conclusions: AHRQ Common Format Harm Scale v.1.1 and v.1.2 both had moderate interrater reliability. Using Harm Scale v.1.1, respondents had difficulty distinguishing "injury limited to additional treatment" from "temporary harm," whereas, using Harm Scale v.1.2, respondents had difficulty distinguishing moderate harm from one of the adjacent levels (mild or severe harm). This study provides valuable data that can inform harm scale revision to improve the quality of aggregate safety data used to define and direct safety efforts.

MeSH terms

  • Data Collection
  • Humans
  • Patient Safety*
  • Reproducibility of Results
  • Risk Management*
  • Safety Management*
  • United States
  • United States Agency for Healthcare Research and Quality