Background: Tools for the evaluation, improvement and promotion of the teaching excellence of faculty remain elusive in residency settings. This study investigates (i) the reliability and validity of the data yielded by using two new instruments for evaluating the teaching qualities of medical faculty, (ii) the instruments' potential for differentiating between faculty, and (iii) the number of residents' evaluations needed per faculty to reliably use the instruments.
Methods and materials: Multicenter cross-sectional survey among 546 residents and 629 medical faculty representing 29 medical (non-surgical) specialty training programs in The Netherlands. Two instruments for measuring the teaching qualities of faculty were developed, one completed by residents and one by faculty. Statistical analyses included factor analysis, reliability and validity exploration using standard psychometric methods, calculation of the number of residents' evaluations needed per faculty to achieve reliable assessments, and variance components and threshold analyses.
Results: A total of 403 (73.8%) residents completed 3575 evaluations of 570 medical faculty, while 494 (78.5%) faculty self-evaluated. In both instruments, five composite scales of faculty teaching qualities were detected with high internal consistency and reliability: learning climate (Cronbach's alpha of 0.85 for the residents' instrument, 0.71 for the self-evaluation instrument), professional attitude and behavior (0.84/0.75), communication of goals (0.90/0.84), evaluation of residents (0.91/0.81), and feedback (0.91/0.85). Faculty tended to evaluate themselves higher than did the residents. Up to a third of the total variance in various teaching qualities could be attributed to between-faculty differences. Some seven residents' evaluations per faculty are needed for assessments to attain a reliability level of 0.90.
Conclusions: The instruments for evaluating teaching qualities of medical faculty appear to yield reliable and valid data. They are feasible for use in medical residencies, can detect between-faculty differences and supply potentially useful information for improving graduate medical education.
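
Illustrative note: the two reliability quantities reported in the Results, Cronbach's alpha for a composite scale and the number of evaluations needed to reach a target reliability, can be sketched as follows. This is a minimal sketch, not the study's analysis code. The function names and the toy scores are hypothetical, and the use of the Spearman-Brown prophecy formula for the second quantity is an assumption, since the abstract does not state which method was used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one scale.

    scores: list of respondents, each a list of k item scores
    (hypothetical layout; k must be at least 2).
    """
    k = len(scores[0])
    # Sum of the sample variances of each item (column).
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def evaluations_needed(single_reliability, target_reliability):
    """Spearman-Brown prophecy formula (assumed here, not confirmed by the source).

    Returns how many evaluations must be averaged to reach the target
    reliability, given the reliability of a single evaluation.
    """
    return (target_reliability * (1 - single_reliability)) / (
        single_reliability * (1 - target_reliability)
    )


if __name__ == "__main__":
    # Made-up item scores for four respondents on a three-item scale.
    toy_scores = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2]]
    print(cronbach_alpha(toy_scores))
    # Made-up single-evaluation reliability of 0.5, target of 0.90.
    print(evaluations_needed(0.5, 0.90))
```

In practice the result of `evaluations_needed` would be rounded up to a whole number of evaluations.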