Am J Manag Care. 2008 Dec;14(12):833-8.

Benchmarking physician performance: reliability of individual and composite measures

Sarah Hudson Scholle et al. Am J Manag Care. 2008 Dec.

Abstract

Objective: To examine the reliability of quality measures to assess physician performance, which are increasingly used as the basis for quality improvement efforts, contracting decisions, and financial incentives, despite concerns about the methodological challenges.

Study design: Evaluation of health plan administrative claims and enrollment data.

Methods: The study used administrative data from 9 health plans representing more than 11 million patients. The number of quality events (patients eligible for a quality measure), mean performance, and reliability estimates were calculated for 27 quality measures. Composite scores for preventive, chronic, acute, and overall care were calculated as the weighted mean of the standardized scores. Reliability was estimated by calculating the physician-to-physician variance divided by the sum of the physician-to-physician variance plus the measurement variance, and 0.70 was considered adequate.
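The reliability ratio and composite score described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the binomial approximation for measurement variance, and all numeric inputs are assumptions chosen for the example, and the abstract does not say how the composite weights were chosen.

```python
from typing import Sequence


def reliability(physician_var: float, measurement_var: float) -> float:
    """Physician-to-physician variance over the sum of that variance
    and the measurement variance, as defined in the Methods."""
    total = physician_var + measurement_var
    if total <= 0:
        raise ValueError("variances must sum to a positive number")
    return physician_var / total


def binomial_measurement_var(rate: float, n_events: int) -> float:
    """ASSUMPTION: approximate the measurement variance of a pass rate
    observed over n_events quality events as p * (1 - p) / n."""
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    return rate * (1.0 - rate) / n_events


def composite_score(z_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of standardized scores. The weights are hypothetical;
    the abstract does not specify them."""
    if len(z_scores) != len(weights) or not z_scores:
        raise ValueError("need one weight per score")
    return sum(w * z for w, z in zip(weights, z_scores)) / sum(weights)


if __name__ == "__main__":
    ADEQUATE = 0.70        # threshold stated in the Methods
    physician_var = 0.01   # hypothetical physician-to-physician variance
    rate = 0.5             # hypothetical mean performance on one measure

    for n in (10, 50, 200):
        r = reliability(physician_var, binomial_measurement_var(rate, n))
        print(n, round(r, 3), "adequate" if r >= ADEQUATE else "below threshold")
```

Under these made-up inputs, reliability rises with the number of quality events per physician because the measurement variance shrinks while the physician-to-physician variance stays fixed, which is the mechanism behind the study's focus on event counts.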

Results: Ten quality measures had reliability estimates above 0.70 at a minimum of 50 quality events. For other quality measures, reliability was low even when physicians had 50 quality events. The largest proportion of physicians who could be reliably evaluated on a single quality measure was 8% for colorectal cancer screening and 2% for nephropathy screening among patients with diabetes mellitus. More physicians could be reliably evaluated using composite scores (≤17% for preventive care, ≤7% for chronic care, and 15%-20% for an overall composite).

Conclusions: In typical health plan administrative data, most physicians do not have adequate numbers of quality events to support reliable quality measurement. The reliability of quality measures should be taken into account when quality information is used for public reporting and accountability. Efforts to improve data available for physician profiling are also needed.

Figure: Reliability of Preventive Care Composite Measure by the Number of Quality Events Per Physician (Plan C)
