Profiling Individual Surgeon Performance Using Information from a High-Quality Clinical Registry: Opportunities and Limitations

J Am Coll Surg. 2015 Nov;221(5):901-13. doi: 10.1016/j.jamcollsurg.2015.07.454. Epub 2015 Sep 9.


Background: There is increasing interest in profiling the quality of individual medical providers. Valid assessment of individuals should highlight improvement opportunities, but must be interpreted in light of its limitations.

Study design: High-quality clinical data from the American College of Surgeons NSQIP, gathered in accordance with strict policies and specifications, were used to construct individual surgeon-level assessments. There were 39,976 cases evaluated, performed by 197 surgeons across 9 hospitals. Both 2-level (cases by surgeon) and 3-level (cases by surgeon by hospital) risk-adjusted, hierarchical regression analyses were performed. Outcomes were 30-day postoperative morbidity, surgical site infection, and mortality. Surgeon performance was compared in both absolute and relative terms. "Signal-to-noise" reliability was calculated for surgeons and models. Projected case requirements for given reliability levels were generated.

Results: Surgeon performances could be distinguished to different degrees: morbidity was distinguished best, mortality least. Outliers could be identified for morbidity and infection, but not for mortality. Reliability was also highest for morbidity and lowest for mortality. Even models with high overall reliability did not assess all providers reliably. Incorporating institutional effects had predictable consequences: it penalized providers at "good" institutions and benefited providers at "poor" institutions.

Conclusions: Individual surgeon profiles can, at times, be distinguished with moderate or good reliability, but to different degrees in different models. Absolute and relative comparisons are feasible. Incorporating institutional-level effects in individual provider modeling presents an interesting policy dilemma, appearing to benefit providers at "poor-performing" institutions but penalizing those at "high-performing" ones. No portrayal of individual medical provider quality should be accepted without consideration of modeling rationale and, critically, reliability.

Publication types

  • Evaluation Study

MeSH terms

  • Benchmarking / methods*
  • Clinical Competence / standards*
  • Humans
  • Models, Statistical
  • Postoperative Complications / epidemiology
  • Quality Improvement
  • Quality Indicators, Health Care
  • Registries*
  • Reproducibility of Results
  • Risk Adjustment
  • Surgeons / standards*
  • United States