An instrument designed for faculty supervision evaluation by anesthesia residents and its psychometric properties

Anesth Analg. 2008 Oct;107(4):1316-22. doi: 10.1213/ane.0b013e318182fbdd.

Abstract

Background and objectives: We aimed 1) to develop a valid and reliable instrument for faculty supervision evaluation by anesthesia residents and 2) to disclose the sources of error in residents' ratings.

Methods: A qualitative study involving residents and faculty identified constructs of supervisory ability, which were entered as items in a measurement instrument used by 19 residents to evaluate 39 instructors during a 6-month period. The instrument was psychometrically tested under classical item theory and generalizability theory. A decision study, using the parameters of the generalizability (G) study, estimated the number of resident ratings needed to produce dependable measures for a single faculty member.
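As an illustration of the classical item analysis mentioned above, internal consistency is commonly summarized with Cronbach's alpha. The sketch below is a minimal version using the standard formula; the function name `cronbach_alpha` and the toy scores are assumptions made for illustration and are not the study's code or data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of evaluations.

    `scores` is a list of rows, one row per evaluation, each row holding
    one numeric score per item (hypothetical layout, not the study's).
    """
    k = len(scores[0])  # number of items in the instrument
    # Sample variance of each item across evaluations.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the total (summed) score per evaluation.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up scores for three evaluations on a two-item scale.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Alpha approaches 1 when items rise and fall together across evaluations, which is why a high value is read as evidence that the items measure a common construct.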

Results: Nine dimensions emerged from the qualitative study: planning perianesthesia care; providing feedback ("the instructor provides me timely, informal, non-threatening comments on my performance and shows me ways to improve"); being available ("the instructor is promptly available to help me solve problems with patients and procedures"); giving opportunities/fostering resident autonomy; stimulating patient-based learning; demonstrating professionalism; being present during critical events; demonstrating interpersonal skills; and being concerned about safety. Residents provided 970 evaluations. The instrument exhibited internal consistency (Cronbach's alpha=0.93), content and face validity, and a single-factor structure. The generalizability and dependability coefficients were both 0.93. Between-instructor differences accounted for 56% of score variance. Resident-instructor interactions accounted for 44% of score variance, indicating that scores were influenced by each resident's unique perceptions of instructors (halo effect). According to the decision study, dependability of 75% and 95% could be expected with 3 and 33 residents rating each faculty member, respectively.
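The logic of a decision study can be sketched with a deliberately simplified model: treat the between-instructor share of variance as the signal and the resident-instructor interaction share as error that is averaged over the residents rating each instructor. This single-facet projection is an assumption made for illustration. It is not the authors' design, whose variance components are not given in the abstract, so it should not be expected to reproduce the reported range of 3 to 33 residents. The function names are hypothetical.

```python
def dependability(var_instructor, var_error, n_raters):
    """Projected dependability when each instructor's score is the mean
    of `n_raters` ratings (simplified single-facet assumption)."""
    return var_instructor / (var_instructor + var_error / n_raters)


def raters_needed(var_instructor, var_error, target):
    """Smallest number of raters whose projected dependability
    reaches `target` (a value strictly between 0 and 1)."""
    n = 1
    while dependability(var_instructor, var_error, n) < target:
        n += 1
    return n


# Variance shares taken from the abstract: 56% between instructors,
# 44% resident-instructor interaction (used here as the error term).
single_rating = dependability(0.56, 0.44, 1)
n_for_75 = raters_needed(0.56, 0.44, 0.75)
```

The point of the sketch is the shape of the relationship: because error shrinks in proportion to the number of ratings averaged, each additional rater buys less dependability than the one before, so high targets require disproportionately many raters.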

Conclusions: The nine-item instrument produced valid and reliable measures of faculty supervision. However, a significant halo effect biased these measures. G-studies may help identify the type and magnitude of rater biases affecting resident-generated evaluations of faculty supervision, and can be useful for interpreting their results, especially if personnel decisions (e.g., tenure, promotion) rely on such measures.

MeSH terms

  • Anesthesiology / education*
  • Evaluation Studies as Topic
  • Faculty, Medical*
  • Humans
  • Internship and Residency*
  • Psychometrics
  • Surveys and Questionnaires
  • Teaching*