Background: Emergency department (ED) overcrowding has become a frequent topic of investigation. Despite a significant body of research, there is no standard definition or measurement of ED crowding. Four quantitative scales for ED crowding have been proposed in the literature: the Real-time Emergency Analysis of Demand Indicators (READI), the Emergency Department Work Index (EDWIN), the National Emergency Department Overcrowding Study (NEDOCS) scale, and the Emergency Department Crowding Scale (EDCS). These four scales have yet to be independently evaluated and compared.
Objectives: The goals of this study were to formally compare four existing quantitative ED crowding scales by measuring their ability to detect instances of perceived ED crowding and to determine whether any of these scales provide a generalizable solution for measuring ED crowding.
Methods: Data were collected at two-hour intervals over 135 consecutive sampling instances. Physician and nurse agreement was assessed using weighted kappa statistics. The crowding scales were compared via correlation statistics and their ability to predict perceived instances of ED crowding. Sensitivity, specificity, and positive predictive values were calculated at site-specific cut points and at the recommended thresholds.
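As an illustration of the cut-point evaluation described in the Methods, the following minimal sketch computes sensitivity, specificity, and positive predictive value for a single scale. The function name, the toy data, and the convention that a score at or above the cut point counts as "crowded" are assumptions made for this example; they are not taken from the study.

```python
def classification_metrics(scores, crowded, cut_point):
    """Return (sensitivity, specificity, PPV) for one crowding scale.

    scores    -- scale values, one per sampling instance
    crowded   -- booleans, True where providers perceived crowding
    cut_point -- assumed rule: score >= cut_point is flagged as crowded
    """
    tp = fp = tn = fn = 0
    for score, is_crowded in zip(scores, crowded):
        flagged = score >= cut_point
        if flagged and is_crowded:
            tp += 1
        elif flagged and not is_crowded:
            fp += 1
        elif not flagged and is_crowded:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    ppv = tp / (tp + fp) if (tp + fp) else nan
    return sensitivity, specificity, ppv


# Hypothetical toy data, not study data.
toy_scores = [10, 20, 30, 40, 50, 60]
toy_crowded = [False, False, True, False, True, True]
sens, spec, ppv = classification_metrics(toy_scores, toy_crowded, cut_point=35)
```

Repeating the call over a range of cut points and keeping the one with the preferred trade-off is one way a site-specific cut point could be chosen, although the abstract does not state how the study selected its cut points.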
Results: All four of the crowding scales were significantly correlated, but their predictive abilities varied widely. NEDOCS had the highest area under the receiver operating characteristic curve (AROC) (0.92), while EDCS had the lowest (0.64). The recommended thresholds for the crowding scales were rarely exceeded; therefore, the scales were adjusted to site-specific cut points. At a site-specific cut point of 37.19, NEDOCS had the highest sensitivity (0.81), specificity (0.87), and positive predictive value (0.62).
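For readers unfamiliar with the AROC, the sketch below computes it using its rank interpretation: the probability that a randomly chosen crowded instance scores higher than a randomly chosen non-crowded one, with ties counted as one half. This is a generic formulation with hypothetical names and toy data, and is not necessarily the procedure used in the study.

```python
def aroc(scores, crowded):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, c in zip(scores, crowded) if c]
    neg = [s for s, c in zip(scores, crowded) if not c]
    if not pos or not neg:
        raise ValueError("need at least one crowded and one non-crowded instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # crowded instance ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not study data.
example_scores = [10, 20, 30, 40, 50, 60]
example_crowded = [False, False, True, False, True, True]
example_aroc = aroc(example_scores, example_crowded)
```

An AROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which places the reported values of 0.92 and 0.64 in context.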
Conclusions: At the study site, the suggested thresholds of the published crowding scales did not agree with providers' perceptions of ED crowding. Even after adjusting the scales to site-specific thresholds, a relatively low prevalence of ED crowding resulted in unacceptably low positive predictive values for each scale. These results indicate that the crowding scales lack scalability and do not perform as designed in EDs where crowding is not the norm. However, two of the crowding scales, EDWIN and NEDOCS, and one of the READI subscales, bed ratio, yielded good predictive power (AROC > 0.80) for perceived ED crowding, suggesting that they could be used effectively after a period of site-specific calibration at EDs where crowding is a frequent occurrence.
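The dependence of positive predictive value on prevalence noted in the Conclusions follows from Bayes' rule. The sketch below applies the standard identity to the NEDOCS sensitivity (0.81) and specificity (0.87) reported in the Results. The prevalence values in the loop are hypothetical and chosen only for illustration, since the abstract does not report the observed prevalence.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Reported NEDOCS values at the site-specific cut point.
SENSITIVITY = 0.81
SPECIFICITY = 0.87

# Hypothetical prevalence values, for illustration only.
for prevalence in (0.05, 0.10, 0.20, 0.50):
    ppv = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}")
```

With sensitivity and specificity held fixed, the implied positive predictive value rises from about 0.25 at a prevalence of 5% to about 0.86 at 50%. A prevalence near 20% gives a value close to the reported 0.62, which is only a consistency check under this identity and not a figure stated in the abstract.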