Investigating gender-related construct-irrelevant components of scores on the written assessment exercise of a high-stakes certification assessment

Adv Health Sci Educ Theory Pract. 2005;10(1):53-63. doi: 10.1007/s10459-004-4297-y.


The ECFMG Clinical Skills Assessment (CSA) was developed to evaluate whether graduates of international medical schools (international medical graduates, IMGs) are ready to enter graduate training programs in the United States. The patient note (PN) exercise is specifically used to assess a candidate's ability to summarize and synthesize the data collected in a simulated patient interview. In a 1-year period, over 7700 first-time takers completed the CSA, resulting in over 77,000 physician-based performance ratings. An initial pilot study indicated that, based solely on handwriting, the raters were able to correctly classify the gender of the candidate approximately 70% of the time. This result, combined with the fact that the notes are holistically scored, suggests that rating bias is possible. The purpose of this study was to investigate whether the gender of the candidate, the gender of the performing standardized patient (SP), and the gender of the rater had any impact on scores. An analysis of covariance (ANCOVA) indicated that there was no significant interaction between candidate and rater gender. Female candidates significantly outperformed males, regardless of rater gender (p < 0.01, effect size = 0.23). The results of this study suggest that, based on rater, SP, and candidate characteristics, the validity of the PN ratings is not compromised.
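
The abstract reports an effect size of 0.23 for the female-male difference but does not say which statistic was used. As a rough illustration only, the sketch below assumes a standardized mean difference (Cohen's d with a pooled standard deviation). The function name `cohens_d` and the score lists are hypothetical, made up for the example, and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (Cohen's d) using a pooled SD.

    Assumption: the abstract's "effect size" may be a different
    statistic; this is one common choice, shown for illustration.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical patient note ratings, NOT taken from the study.
female_scores = [7.0, 6.5, 8.0, 7.5, 6.0]
male_scores = [6.5, 6.0, 7.5, 7.0, 5.5]

d = cohens_d(female_scores, male_scores)
print(f"Cohen's d = {d:.2f}")
```

Under this reading, a value near 0.2 is conventionally described as a small effect, which helps explain how a difference can be statistically significant in a sample of this size while remaining modest in magnitude.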

MeSH terms

  • Adult
  • Aged
  • Certification*
  • Cohort Studies
  • Educational Measurement / methods*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Sex Factors*
  • Students, Medical*
  • United States