Gender Differences in Milestone Ratings and Medical Knowledge Examination Scores Among Internal Medicine Residents

Acad Med. 2021 Mar 11. doi: 10.1097/ACM.0000000000004040. Online ahead of print.


Purpose: To examine whether the milestone ratings that program directors, working with clinical competency committees (CCCs), submit for internal medicine (IM) residents differ by gender, and whether women and men with similar milestone ratings perform comparably on subsequent in-training and certification examinations.

Method: This national retrospective study examined end-of-year medical knowledge (MK) and patient care (PC) milestone ratings and IM In-Training Examination (IM-ITE) and IM Certification Examination (IM-CE) scores for 2 cohorts (2014-2017, 2015-2018) of U.S. IM residents at ACGME-accredited programs. The study included 20,098 of 21,440 residents (94%): 9,424 women (47%) and 10,674 men (53%). Analyses used descriptive statistics and differential prediction techniques based on hierarchical linear models.
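
As a rough illustration of what a differential prediction analysis with a hierarchical linear model can look like, the sketch below writes one common form of such a model. It is an assumption-laden sketch, not the authors' specification: the abstract does not state the covariates, the grouping level, or whether an interaction term was included. Here $Y_{ij}$ is the examination score of resident $i$ in program $j$, $R_{ij}$ the milestone rating, $G_{ij}$ a gender indicator, $\mathbf{x}_{ij}$ a hypothetical covariate vector, and $u_j$ a hypothetical program-level random intercept.

```latex
% Sketch of a differential prediction model (assumed form, not taken from the study)
\begin{align}
Y_{ij} &= \beta_0 + \beta_1 R_{ij} + \beta_2 G_{ij}
          + \beta_3 \,(R_{ij} \times G_{ij})
          + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij}
          + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this form, $\beta_2 \neq 0$ would indicate that women and men with the same rating differ in expected examination score (an intercept difference), and $\beta_3 \neq 0$ would indicate that the rating-score relationship itself differs by gender (a slope difference).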

Results: In postgraduate year 1 (PGY-1), MK milestone ratings did not differ significantly between men and women at the .01 significance level (P = .02). In PGY-2 and PGY-3, men received statistically higher average MK ratings than women (P = .002 and P < .001, respectively). In contrast, men and women received equivalent average PC ratings in each PGY (P = .47, P = .72, and P = .80 for PGY-1, PGY-2, and PGY-3, respectively). Among residents with similar MK or PC ratings in PGY-1 and PGY-2, men slightly outperformed women on the IM-ITE, by about 1.7 and 1.5 percentage points, respectively, after adjusting for covariates. For PGY-3 ratings, women and men with similar milestone ratings performed equivalently on the IM-CE.
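
The significance statements for the milestone ratings follow from comparing each reported P value with the stated .01 threshold. The short Python sketch below makes that comparison explicit using only the P values given in the Results. The labels and the `is_significant` helper are illustrative names, not part of the study, and "P < .001" is entered as its upper bound, 0.001, which is an assumption made only so the value can be compared.

```python
# Significance threshold stated in the Results.
ALPHA = 0.01

# Reported P values for the gender comparison of average milestone ratings.
# "P < .001" (MK, PGY-3) is represented by its upper bound (assumption).
reported_p = {
    "MK PGY-1": 0.02,
    "MK PGY-2": 0.002,
    "MK PGY-3": 0.001,  # upper bound of "P < .001"
    "PC PGY-1": 0.47,
    "PC PGY-2": 0.72,
    "PC PGY-3": 0.80,
}


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True when the P value falls below the significance level."""
    return p_value < alpha


# Map each comparison to its significance decision at ALPHA.
results = {label: is_significant(p) for label, p in reported_p.items()}

for label, significant in results.items():
    verdict = "significant" if significant else "not significant"
    print(f"{label}: P = {reported_p[label]} -> {verdict} at alpha = {ALPHA}")
```

With the .01 threshold, only the PGY-2 and PGY-3 MK comparisons are flagged as significant. The PGY-1 MK comparison (P = .02) is not, although it would be under the more conventional .05 threshold.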

Conclusions: Milestone ratings were largely similar for women and men. Generally, women and men with similar MK or PC milestone ratings performed similarly on future examinations. Although there were small differences favoring men on earlier examinations, these differences disappeared by the final training year. It is questionable whether these small differences are educationally or clinically meaningful. The findings suggest that program directors and CCCs generate fair, unbiased milestone ratings when assessing residents.