Using Machine Learning to Evaluate Attending Feedback on Resident Performance

Anesth Analg. 2021 Feb 1;132(2):545-555. doi: 10.1213/ANE.0000000000005265.

Abstract

Background: High-quality and high-utility feedback allows for the development of improvement plans for trainees. The current manual assessment of the quality of this feedback is time-consuming and subjective. We propose the use of machine learning to rapidly distinguish the quality of attending feedback on resident performance.

Methods: Using a preexisting databank of 1925 manually reviewed feedback comments from 4 anesthesiology residency programs, we trained machine learning models to predict whether comments contained 6 predefined feedback traits (actionable, behavior focused, detailed, negative feedback, professionalism/communication, and specific) and to predict the utility score of the comment on a scale of 1-5. Comments with ≥4 feedback traits were classified as high-quality, and comments with utility scores of ≥4 were classified as high-utility; otherwise, comments were considered low-quality or low-utility, respectively. We used RapidMiner Studio (RapidMiner, Inc, Boston, MA), a data science platform, to train, validate, and score the performance of models.
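The classification rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code; the trait names follow the abstract, while the function names and dictionary representation are assumptions.

```python
# Illustrative sketch of the paper's classification rule: a comment is
# "high-quality" if it exhibits >= 4 of the 6 predefined feedback traits,
# and "high-utility" if its utility score (scale 1-5) is >= 4.
TRAITS = [
    "actionable",
    "behavior_focused",
    "detailed",
    "negative_feedback",
    "professionalism_communication",
    "specific",
]

def quality_category(trait_flags: dict) -> str:
    """trait_flags maps each trait name to True/False (model predictions)."""
    n_present = sum(bool(trait_flags.get(t, False)) for t in TRAITS)
    return "high-quality" if n_present >= 4 else "low-quality"

def utility_category(utility_score: int) -> str:
    """utility_score is the predicted 1-5 utility rating."""
    return "high-utility" if utility_score >= 4 else "low-utility"

# Example: a comment predicted to contain 4 of the 6 traits
flags = {
    "actionable": True,
    "behavior_focused": True,
    "detailed": True,
    "negative_feedback": False,
    "professionalism_communication": True,
    "specific": False,
}
print(quality_category(flags))  # 4 traits present -> "high-quality"
print(utility_category(3))      # score below 4 -> "low-utility"
```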

Results: Models for predicting the presence of feedback traits had accuracies of 74.4%-82.2%. Predictions of utility category were 82.1% accurate, with 89.2% sensitivity and 89.8% class precision for low-utility predictions. Predictions of quality category were 78.5% accurate, with 86.1% sensitivity and 85.0% class precision for low-quality predictions. A research assistant with no prior experience in machine learning spent 15 to 20 hours becoming familiar with the software, creating models, and reviewing the performance of the resulting predictions. The program read data, applied models, and generated predictions within minutes. In contrast, a recent manual feedback-scoring effort by an author took 15 hours, over the course of 2 weeks, to collate and score 200 comments.
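The reported metrics are the standard binary-classification measures, with the low-utility (or low-quality) category treated as the positive class. A minimal sketch of how they are computed from a confusion matrix (the counts below are made up for illustration, not taken from the study):

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int):
    """Accuracy, sensitivity (recall), and class precision for the
    positive class, computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    precision = tp / (tp + fp)     # fraction of positive calls that are correct
    return accuracy, sensitivity, precision

# Hypothetical counts, treating "low-utility" as the positive class
acc, sens, prec = binary_metrics(tp=90, fp=10, fn=11, tn=89)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} precision={prec:.3f}")
```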

Conclusions: Harnessing the potential of machine learning allows for rapid assessment of attending feedback on resident performance. Using predictive models to rapidly screen for low-quality and low-utility feedback can aid programs in improving feedback provision, both globally and by individual faculty.

Publication types

  • Multicenter Study

MeSH terms

  • Anesthesiologists / education*
  • Anesthesiology / education*
  • Clinical Competence*
  • Data Mining*
  • Databases, Factual
  • Education, Medical, Graduate*
  • Employee Performance Appraisal
  • Formative Feedback*
  • Humans
  • Internship and Residency*
  • Machine Learning*
  • Medical Staff, Hospital*
  • Task Performance and Analysis
  • United States