Differential privacy in health research: A scoping review

J Am Med Inform Assoc. 2021 Sep 18;28(10):2269-2276. doi: 10.1093/jamia/ocab135.

Abstract

Objective: Differential privacy is a relatively new method for data privacy that has seen growing use due to its strong protections, which rely on added noise. This study assesses the extent of its awareness, development, and usage in health research.
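
To illustrate what "added noise" can mean in practice, the following is a minimal sketch of the Laplace mechanism applied to a counting query. It is not taken from the reviewed articles. The function names (`laplace_noise`, `private_count`), the toy age data, and the choice of epsilon = 1.0 are all assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential draws with rate 1/scale
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of records matching predicate (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    # A counting query has sensitivity 1: adding or removing one
    # person changes the count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy data: ages of study participants (not from the review).
ages = [34, 51, 67, 45, 72, 29, 58, 63]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(ages, lambda a: a >= 60, epsilon=1.0, rng=rng)
print(f"noisy count of participants aged 60+: {noisy:.2f}")
```

The true count here is 3; each call returns that value plus random noise whose typical size is 1/epsilon, so a smaller epsilon gives stronger privacy and a less accurate answer. A production system would also need a cryptographically sound noise source and a way to track cumulative privacy loss across queries, which this sketch omits.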

Materials and methods: A scoping review was conducted by searching for ["differential privacy" AND "health"] in major health science databases, with additional articles obtained via expert consultation. Relevant articles were classified according to subject area and focus.

Results: A total of 54 articles met the inclusion criteria. Nine articles provided descriptive overviews, 31 focused on algorithm development, 9 presented novel data sharing systems, and 8 discussed appraisals of the privacy-utility tradeoff. The most common areas of health research where differential privacy has been discussed are genomics, neuroimaging studies, and health surveillance with personal devices. Algorithms were most commonly developed for the purposes of data release and predictive modeling. Studies on privacy-utility appraisals have considered economic cost-benefit analysis, low-utility situations, personal attitudes toward sharing health data, and mathematical interpretations of privacy risk.

Discussion: Differential privacy remains at an early stage of development for applications in health research, and accounts of real-world implementations are scant. There are few algorithms for explanatory modeling and statistical inference, particularly with correlated data. Furthermore, diminished accuracy in small datasets is problematic. Some encouraging work has been done on decision making with regard to epsilon, the parameter that sets the strength of the privacy guarantee. The dissemination of future case studies can inform successful appraisals of privacy and utility.
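
The difficulty with small datasets can be seen with a short calculation. The sketch below is illustrative only and rests on stated assumptions: a mean is released with Laplace noise, every value is clipped to a known range, the dataset size n is treated as public, and neighboring datasets differ by replacing one record. Under those assumptions the sensitivity of the mean is range/n, so the expected absolute noise is range/(n * epsilon). The function name `expected_abs_noise` and the chosen values of n and epsilon are hypothetical.

```python
def expected_abs_noise(n: int, epsilon: float, value_range: float = 1.0) -> float:
    """Expected absolute Laplace noise on a mean of n values bounded within value_range."""
    if n <= 0 or epsilon <= 0:
        raise ValueError("n and epsilon must be positive")
    # Replacing one record can shift the mean by at most value_range / n.
    sensitivity = value_range / n
    # The Laplace scale is sensitivity / epsilon, and the expected absolute
    # value of a zero-mean Laplace variable equals its scale.
    return sensitivity / epsilon


# Compare a small, medium, and large study at two privacy levels.
for n in (50, 500, 5000):
    for epsilon in (0.1, 1.0):
        noise = expected_abs_noise(n, epsilon)
        print(f"n={n:>5}  epsilon={epsilon:<4}  expected |noise| = {noise:.4f}")
```

For a proportion measured on a 0 to 1 scale, the noise at n = 50 and epsilon = 0.1 is as large as many effects of interest, while at n = 5000 it is a hundred times smaller. This is one way to see why the choice of epsilon and the sample size have to be appraised together.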

Conclusions: More development, case studies, and evaluations are needed before differential privacy can see widespread use in health research.

Keywords: confidentiality; data sharing; differential privacy; privacy; statistical disclosure limitation.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Confidentiality*
  • Databases, Factual
  • Genomics
  • Privacy*