Discrepancies in Race and Ethnicity in the Electronic Health Record Compared to Self-report

J Racial Ethn Health Disparities. 2022 Nov 23. doi: 10.1007/s40615-022-01445-w. Online ahead of print.


Objective: Racial and ethnic disparities are commonplace in health care. Research often relies on sociodemographic information recorded in the electronic health record (EHR). Little evidence is available about the accuracy of EHR-recorded sociodemographic information, and none in pediatrics. Our objective was to determine the accuracy of EHR-recorded race and ethnicity compared to self-report.

Methods: Patients/guardians enrolled in two prospective observational studies (10/2014-1/2019) provided self-reported sociodemographic information. Corresponding EHR information was abstracted and compared to self-report, which was considered the "gold standard." Agreement was evaluated with Cohen's kappa.
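
For readers unfamiliar with the agreement statistic, the following is a minimal sketch of unweighted Cohen's kappa for two sets of categorical labels. It is not the authors' analysis code; the function name `cohens_kappa` and the six toy records are hypothetical and chosen only for illustration. Kappa compares the observed proportion of matching labels with the proportion expected by chance given each source's category frequencies.

```python
from collections import Counter


def cohens_kappa(self_report, ehr):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(self_report) != len(ehr) or len(self_report) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(self_report)

    # Observed agreement: fraction of patients whose two labels match.
    observed = sum(a == b for a, b in zip(self_report, ehr)) / n

    # Chance agreement: for each category, the product of its marginal
    # proportions in the two sources, summed over categories.
    self_counts = Counter(self_report)
    ehr_counts = Counter(ehr)
    expected = sum(self_counts[c] * ehr_counts[c] for c in self_counts) / n**2

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical toy data (NOT from the study): six patients.
toy_self = ["White", "White", "Black", "Asian", "White", "Black"]
toy_ehr = ["White", "White", "Black", "White", "White", "White"]
toy_kappa = cohens_kappa(toy_self, toy_ehr)
print(round(toy_kappa, 3))
```

In this toy example four of six labels match (observed agreement 0.67), but because "White" dominates the EHR labels, chance agreement is already 0.47, so kappa is much lower than the raw match rate suggests.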

Results: A total of 503 patients (42% female, median age 12.8 years) were identified. Self-reported race (N = 484) was 73% White, 16% Black or African American (AA), 4% Asian, 5% multiracial, and 2% other. Self-reported ethnicity (N = 410) was 9% Hispanic/Latino and 88% non-Hispanic/Latino. Agreement between self-reported and EHR-recorded race was substantial (kappa = 0.77, 95% CI 0.72-0.83). Race was discordant among 10% (47/476). Agreement for Hispanic/Latino ethnicity was also substantial (kappa = 0.77, 95% CI 0.65-0.89). Among those who self-reported Hispanic/Latino and reported race (N = 21), race was less accurately recorded in the EHR (kappa = 0.26, 95% CI 0-0.54). Race did not match among 43% with recorded race (9/21). Among self-reported racial and/or ethnic minorities, 13% (12/164) were misclassified in the EHR as non-Hispanic White.

Conclusions: We found that race and ethnicity are often inaccurately recorded in the EHR for patients who self-identify as minorities, leading to under-representation of minorities in the EHR. Inaccurate recording of race and ethnicity has important implications for disparity research and for informing health policy. Reliable processes are needed to incorporate self-reported race and ethnicity in the EHR at institutional and national levels.

Keywords: Data quality; Disparities; Electronic health record; Ethnicity; Race.