Differential Privacy Protections in 2020 U.S. Decennial Census Data Do Not Impede Measurement of Racial and Ethnic Disparities

Med Care Res Rev. 2024 May 14:10775587241251870. doi: 10.1177/10775587241251870. Online ahead of print.

Abstract

Census data are vital to health care research but must also protect respondents' confidentiality. The 2020 decennial Census employs a new Differential Privacy framework; this study examines its effect on the accuracy of an important tool for measuring health disparities, the Bayesian Improved Surname and Geocoding (BISG) algorithm, which uses Census Block Group data to estimate race and ethnicity when self-reported data are unavailable. Using self-reported race and ethnicity data as the reference standard, we compared the accuracy of BISG estimates calculated using the original 2010 Census counts to the accuracy of estimates calculated using the same 2010 data with the 2020 Differential Privacy protections applied. The Differential Privacy methodology slightly decreases BISG accuracy for American Indian and Alaska Native people but has little effect for other groups, suggesting that the methodology will not impede health disparities research that employs BISG and similar methods.
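For readers unfamiliar with BISG, the sketch below illustrates the Bayesian update that such methods are commonly described as performing: a surname-based prior over race and ethnicity groups is multiplied by the probability of living in a given Census Block Group for each group, then renormalized. This is a minimal illustration under stated assumptions, not the implementation evaluated in the study. The function names, group labels, and all counts and probabilities are hypothetical.

```python
def geo_likelihood(block_group_counts, national_totals):
    """P(block group | group): share of each group's national population
    that lives in this block group. Inputs are hypothetical counts."""
    return {g: block_group_counts[g] / national_totals[g] for g in national_totals}


def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine a surname prior with a geographic likelihood via Bayes' rule."""
    unnormalized = {
        g: p_group_given_surname[g] * p_geo_given_group[g]
        for g in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no group has nonzero probability for this surname and location")
    return {g: v / total for g, v in unnormalized.items()}


# Hypothetical inputs, for illustration only (not real Census figures).
surname_prior = {"group_a": 0.6, "group_b": 0.3, "group_c": 0.1}
block_group_counts = {"group_a": 100, "group_b": 400, "group_c": 200}
national_totals = {"group_a": 100_000, "group_b": 100_000, "group_c": 100_000}

likelihood = geo_likelihood(block_group_counts, national_totals)
posterior = bisg_posterior(surname_prior, likelihood)
print(posterior)
```

Because the Block Group counts enter only through the geographic likelihood, any perturbation of those counts shifts the posterior through that single term. This is the pathway by which privacy protections applied to Census counts could affect the accuracy of BISG estimates.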

Keywords: Bayesian improved surname and geocoding; decennial census; differential privacy; health disparities; race and ethnicity.