Background: Hospitals are increasingly evaluated with respect to the quality of the care they provide. In this setting, several indicator sets compete with one another to assess effectiveness and safety, yet there have been few comparative investigations covering different sets. The objective of this study was to answer three questions: How concordant are different indicator sets at the hospital level? What is the effect of applying different reference values? How stable are the positions in a hospital ranking?
Methods: Routine data were made available to three companies offering the Patient Safety Indicators, an indicator set from the HELIOS Hospital Group, and measurements based on Disease Staging™. Ten hospitals from North Rhine-Westphalia, comprising a total of 151,960 inpatients in 2006, volunteered to participate in this study. The companies provided standard quality reports for the ten hospitals. Composite measures were defined for strengths and weaknesses. In addition to the different indicator sets, different reference values for one set allowed the construction of several comparison groups. Concordance and robustness were analyzed using a non-parametric rank correlation coefficient and Kendall's coefficient of concordance (W).
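The abstract does not name the software used for these analyses; the following minimal Python sketch (hypothetical rank data, no correction for ties) illustrates the two statistics in question: a non-parametric rank correlation for pairwise agreement between indicator sets, and Kendall's W for overall concordance across all sets.

```python
import numpy as np
from scipy.stats import spearmanr

def kendalls_w(ranks: np.ndarray) -> float:
    """Kendall's coefficient of concordance W (no tie correction).

    ranks: (m, n) array -- m indicator sets, each ranking n hospitals 1..n.
    Returns W in [0, 1]; 1 means all sets rank the hospitals identically.
    """
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)                      # total rank per hospital
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()    # squared deviations
    return 12.0 * s / (m**2 * (n**3 - n))

# Hypothetical example: 3 indicator sets ranking 10 hospitals.
rng = np.random.default_rng(42)
ranks = np.vstack([rng.permutation(10) + 1 for _ in range(3)])

print(f"Kendall's W: {kendalls_w(ranks):.3f}")

# Pairwise concordance of two sets via Spearman's rank correlation.
r, p = spearmanr(ranks[0], ranks[1])
print(f"Spearman r = {r:.3f}, p = {p:.3f}")
```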
Results: Indicator sets differing only in the reference values of the indicators showed significant correlations for most pairs with respect to weaknesses (maximum r = 0.927, CI 0.714-0.983, p < 0.001). There were also significant correlations between sets comprising different indicators or applying different methods of performance assessment (maximum r = 0.829, CI 0.417-0.958, p = 0.003). The correlations were weaker when measuring hospital strengths (maximum r = 0.669, CI 0.068-0.914, p = 0.034). In a hospital ranking, only two hospitals belonged consistently either to the superior or to the inferior half of the group. Even altering the reference values or the supplier for the same indicator set changed the rank of nine out of ten hospitals.
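The reported confidence intervals are consistent with Fisher's z-transformation applied to a correlation over n = 10 hospitals (an assumption; the abstract does not state how the intervals were derived). A short check reproduces the bounds of the strongest reported correlation:

```python
import numpy as np
from scipy.stats import norm

def fisher_ci(r: float, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Approximate CI for a correlation coefficient via Fisher's z-transform."""
    z = np.arctanh(r)                  # map r to an approximately normal scale
    se = 1.0 / np.sqrt(n - 3)          # standard error of z
    crit = norm.ppf(1 - alpha / 2)     # 1.96 for a 95% interval
    return float(np.tanh(z - crit * se)), float(np.tanh(z + crit * se))

# n = 10 hospitals, r = 0.927 -> (0.714, 0.983), matching the reported CI.
print(fisher_ci(0.927, 10))
```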
Conclusions: Our results reveal an unsettling lack of concordance in estimates of hospital performance when different quality indicator sets are used. These findings underline the lack of consensus on optimal, validated measures for judging hospital quality. The indicator sets shared a common definition of quality, independent of their focus on patient safety, mortality, or length of stay. However, for most hospitals, changing the indicator set or the reference value shifted them from the superior to the inferior half of the group or vice versa. Thus, while the indicator sets, taken together, offer hospitals complementary pictures of their quality, individually they do not establish a reliable ranking.