Objectives: To determine the reliability of scores assigned to interviews of medical students applying to an emergency medicine program.
Methods: A scoring instrument was developed from faculty and resident input, institutional and national documents, and previous application procedures. Candidates were interviewed by four pairs of interviewers. Interviewers were asked to score the candidates on five visual analog scales (VASs) with objective anchors. Each of the four interviews assessed a different candidate characteristic. All interviewers were given explicit instructions on scoring procedures and instrument use. The data were entered into an Excel database and transferred to SPSS, and reliabilities were measured with Cronbach's alpha under a two-way mixed-effects model.
Results: Forty applications were received for the 2002 residency entry year. Thirty-eight application packages were complete, and 16 candidates were interviewed. Data collection was complete for all 16. The average measure intraclass correlations for each individual interviewer across the five VASs ranged from 0.72 to 0.92 (mean, 0.85). The interrater reliabilities within the four interviews (personal characteristics, trainability, suitability for emergency medicine, and suitability for the specific training program) were low to moderate at 0.36, 0.59, 0.69, and 0.49, respectively. The overall reliability of the four interview scores was 0.83, and for the eight interviewer scores it was 0.86.
Conclusions: The reliability of the overall interview scores was very high. The intraclass correlations for each interviewer's VAS scores were also high, but interrater correlations within interview teams were moderate and not higher than those across interview teams. This study suggests that an interview assessment instrument can be highly reliable overall and that interviewers base their scores on a global impression of the candidate.
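
For readers unfamiliar with the statistic named in the Methods, the following is a minimal sketch of Cronbach's alpha, which coincides with the average measure intraclass correlation under a two-way mixed-effects consistency model. It is not the authors' SPSS procedure; the function name `cronbach_alpha` and the example scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per candidate; each row holds one
    score per rater (or per scale). All rows must have equal length.
    """
    k = len(scores[0])  # number of raters (or scales)
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two raters and two candidates")

    # Sample variance of each rater's column of scores, summed.
    columns = list(zip(*scores))
    sum_item_var = sum(variance(col) for col in columns)

    # Sample variance of each candidate's total score across raters.
    total_var = variance([sum(row) for row in scores])

    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical scores (NOT study data): 3 candidates, 2 raters.
agreeing = [[1, 1], [2, 2], [3, 3]]
partly_agreeing = [[1, 2], [2, 1], [3, 3]]

print(cronbach_alpha(agreeing))
print(cronbach_alpha(partly_agreeing))
```

In this sketch, two raters who rank every candidate identically give an alpha of 1, and the value falls as their scores diverge. Applied with raters as columns it corresponds to an interrater reliability; applied with one interviewer's five VAS scores as columns it corresponds to the per-interviewer figure.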