Objective: To determine how well one state's peer review organization (PRO) judged the quality of hospital care compared with an independent, credible judgment of quality of care.
Design: Retrospective study comparing a PRO's review, including initial screening, physician review, and final judgments, with an independent "study judgment" based on blinded, structured, implicit reviews of hospital records.
Setting: One state's medical and surgical Medicare hospitalizations during 1985 through 1987 audited randomly by the state's PRO.
Sample: Stratified random sampling of records: 62 records that passed the PRO initial screening process and were not referred for PRO physician review; 50 records that failed the PRO screen and then were confirmed by PRO physicians to be "quality problems."
Main outcome measure: A study judgment of "below standard" or "standard or above," based on the mean of overall ratings by five internists for records in medical diagnosis related groups (DRGs) and by five internists and five surgeons for records in surgical DRGs. Each step in the PRO review was evaluated in two ways: how many records failing or passing that step were judged below standard or standard or above, respectively, by the study (positive and negative predictive value), and how well that step classified records that the study judged below standard or standard or above (sensitivity and specificity).
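The four measures named above can be illustrated with a minimal Python sketch. It assumes that "positive" means a record failed a given PRO step and that the study judgment serves as the reference standard. The class name `TwoByTwo` and the counts are hypothetical and are not data from the study; the sketch also ignores any weighting needed for the stratified sample.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts comparing one PRO review step with the study judgment.

    Assumption: a "positive" is a record that FAILED the PRO step.
    """
    tp: int  # failed the step, study judged below standard
    fp: int  # failed the step, study judged standard or above
    fn: int  # passed the step, study judged below standard
    tn: int  # passed the step, study judged standard or above

    def sensitivity(self) -> float:
        # Share of below-standard records that the step failed
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of standard-or-above records that the step passed
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Share of failed records that the study judged below standard
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Share of passed records that the study judged standard or above
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only; NOT data from the study.
example = TwoByTwo(tp=10, fp=20, fn=30, tn=140)
print(example.sensitivity(), example.specificity(), example.ppv(), example.npv())
```

Predictive values depend on how common below-standard care is among the records reviewed, whereas sensitivity and specificity describe the review step itself.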
Results: An estimated 18% of records reviewed by the PRO were below standard according to the study judgment, compared with 6.3% judged to be quality problems in the PRO's final judgment (difference, 12 percentage points; 95% confidence interval, 1 to 23). The PRO's initial screening process failed to detect and refer for PRO physician review two of every three records that the study judged below standard. In addition, only one in three of the records that PRO physicians judged to be quality problems was judged below standard by the study. Thus, the PRO's final quality-of-care judgment and the study judgment agreed little more than expected by chance, especially about poor quality of care. Although the PRO correctly classified 95% of the records that the study judged standard or above, it detected only 11% of the records that the study judged below standard.
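As a rough consistency check on the reported figures, the sketch below derives sensitivity and specificity from the rounded values in the Results. It assumes that the "one of three" predictive value applies to all records in the PRO's final quality-problem category; the variable names are illustrative, and the calculation is not the authors' method.

```python
# Rounded figures taken from the Results section
prevalence = 0.18   # share of records below standard per the study judgment
flag_rate = 0.063   # share called quality problems in the PRO's final judgment
ppv = 1 / 3         # assumption: "one of three" flagged records were below standard

# Shares of all reviewed records
true_pos = flag_rate * ppv        # flagged and below standard
false_pos = flag_rate - true_pos  # flagged but standard or above

# Derived accuracy of the PRO's final judgment
sensitivity = true_pos / prevalence
specificity = (1 - prevalence - false_pos) / (1 - prevalence)

print(f"sensitivity ~ {sensitivity:.1%}, specificity ~ {specificity:.1%}")
```

Under these assumptions the derived values come to roughly 12% and 95%, close to the reported 11% and 95%; a small gap is expected because the inputs are rounded.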
Conclusions: Above all, this PRO review process would be improved by additional preliminary screens to identify the 67% of study-judged below-standard records that passed its initial screening. The screening process must also become more accurate to be cost-effective, because it was only slightly better than random sampling at identifying below-standard care. More reproducible physician review is also needed and might be achieved through improved reviewer selection and training, a structured review method, and more physician reviewers per record.