How well do faculty evaluate the interviewing skills of medical students?

J Gen Intern Med. 1992 Sep-Oct;7(5):499-505. doi: 10.1007/BF02599452.


Objective: To study the reliability and validity of faculty evaluations of medical students' interviewing skills.

Design: All second-year University of North Carolina medical students (n = 159) were observed for 5 minutes while interviewing standardized patients by one of eight experienced clinical faculty members. Interview quality was assessed by a faculty checklist covering questioning style, facilitative behaviors, and specific content. Twenty-one randomly chosen students were videotaped and rated: by the original rater as well as four other raters; by two nationally recognized experts; and according to Roter's coding dimensions, which have been found to correlate strongly with patient compliance and satisfaction.

Setting: Medical school at a state university in the southeastern United States.

Participants: Faculty members who volunteered to evaluate second-year medical students during an annual Objective Structured Clinical Exam.

Interventions: Interrater reliability and intrarater reliability were tested using videotapes of medical students interviewing a standardized patient. Validity was tested by comparing faculty judgments with both an analysis using the Roter Interactional Analysis System and an assessment made by expert interviewers.

Measurements and main results: Faculty mean checklist score was 80% (range 41-100%). Intrarater reliability was poor for assessment of skills and behaviors compared with that for interview content. Interrater reliability was also poor, with intraclass correlation coefficients ranging from 0.11 to 0.37. When compared with the experts, faculty raters had a sensitivity of 80% but a specificity of only 45% in identifying students with adequate skills; the predictive value of faculty assessment was 12%. Analysis using Roter's coding scheme suggests that faculty scored students on the basis of likability rather than specific behavioral skills, limiting their ability to provide behaviorally specific feedback.
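The relationship among sensitivity, specificity, and predictive value reported above follows from Bayes' rule, and depends on the prevalence of the target condition, which the abstract does not state. The sketch below is purely illustrative: the prevalence value is an assumption chosen for demonstration, not a figure from the study.

```python
def predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule:
    PPV = (sens * p) / (sens * p + (1 - spec) * (1 - p)),
    where p is the prevalence of the condition being detected."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Illustration with the study's reported sensitivity (0.80) and
# specificity (0.45), and an assumed prevalence of 10% -- low
# specificity drives the predictive value down sharply.
ppv = predictive_value(0.80, 0.45, 0.10)
```

With these assumed inputs the predictive value comes out near the low figure the study reports, illustrating how a specificity of 45% can undermine faculty ratings even when sensitivity is high.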

Conclusions: To evaluate clinical interviewing skills accurately, we must enhance rater consistency, particularly in assessing those skills that both satisfy patients and yield crucial data.

MeSH terms

  • Clinical Competence*
  • Educational Measurement / standards*
  • Faculty, Medical*
  • Humans
  • Interviews as Topic*
  • Medical History Taking
  • North Carolina
  • Observer Variation
  • Predictive Value of Tests
  • Schools, Medical
  • Students, Medical*