SHARE: system design and case studies for statistical health information release
- PMID: 23059729
- PMCID: PMC3555328
- DOI: 10.1136/amiajnl-2012-001032
Abstract
Objectives: We present SHARE, a new system for statistical health information release with differential privacy, together with two case studies that evaluate the software on real medical datasets and demonstrate the feasibility and utility of applying the differential privacy framework to biomedical data.
Materials and methods: SHARE releases statistical information from electronic health records with differential privacy, a strong privacy framework for statistical data release. It includes a number of state-of-the-art methods for releasing multidimensional histograms and longitudinal patterns. We performed a variety of experiments on two real datasets, the Surveillance, Epidemiology, and End Results (SEER) breast cancer dataset and the Emory electronic medical record (EeMR) dataset, to demonstrate the feasibility and utility of SHARE.
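To make the idea of a differentially private histogram release concrete, the following is a minimal sketch of the textbook Laplace mechanism applied to a small multidimensional histogram. It is not SHARE's implementation: the abstract does not specify its algorithms, and the function names (`laplace_noise`, `dp_histogram`), the toy attributes, and the choice of sensitivity 1 (neighbouring datasets differ by adding or removing one record) are all assumptions made for illustration.

```python
import itertools
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(records, domain, epsilon, rng=None):
    """Return a noisy count for every cell in `domain`.

    Assumption: each record falls in exactly one cell and neighbouring
    datasets differ by one record, so the histogram has L1 sensitivity 1
    and Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    `domain` must be fixed in advance (not derived from the data), so that
    empty cells also receive noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_counts = Counter(records)
    scale = 1.0 / epsilon
    return {cell: true_counts.get(cell, 0) + laplace_noise(scale, rng)
            for cell in domain}


if __name__ == "__main__":
    # Hypothetical two-attribute data cube: (age group, tumour stage).
    age_groups = ["<50", "50-69", "70+"]
    stages = ["I", "II", "III"]
    domain = list(itertools.product(age_groups, stages))
    records = [("<50", "I")] * 40 + [("50-69", "II")] * 25 + [("70+", "III")] * 5
    noisy = dp_histogram(records, domain, epsilon=0.5, rng=random.Random(7))
    for cell in domain:
        print(cell, round(noisy[cell], 2))
```

Because the number of cells grows multiplicatively with each added attribute while each cell receives noise of the same scale, sparse high-dimensional cubes are dominated by noise under this baseline, which is one reason more refined histogram methods exist.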
Results: Experimental results indicate that SHARE can handle the heterogeneity of medical data and that the released statistics are useful. The Kullback-Leibler divergence between the released multidimensional histograms and the original data distribution is below 0.5 for seven-dimensional data cubes and below 0.01 for three-dimensional data cubes generated from the SEER dataset. The relative error for longitudinal pattern queries on the EeMR dataset varies between 0 and 0.3. While the results are promising, they also suggest that challenges remain in applying differentially private statistical data release to higher-dimensional data.
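The two utility measures reported above can be sketched as follows. This is an illustrative reading, not the paper's evaluation code: the direction of the divergence (original distribution first, released distribution second), the clipping of negative noisy counts, the `smoothing` constant, and the `floor` in the relative error denominator are all assumptions, and the function names are hypothetical.

```python
import math


def normalize(counts):
    # Clip negative (noisy) counts to zero and rescale to a probability vector.
    clipped = [max(c, 0.0) for c in counts]
    total = sum(clipped)
    if total == 0:
        raise ValueError("counts must contain at least one positive value")
    return [c / total for c in clipped]


def kl_divergence(original_counts, released_counts, smoothing=1e-9):
    """KL(P || Q) in nats, with P from the original histogram and Q from the
    released one. A small `smoothing` term keeps Q strictly positive."""
    p = normalize(original_counts)
    q = normalize([max(c, 0.0) + smoothing for c in released_counts])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def relative_error(true_answer, noisy_answer, floor=1.0):
    """|noisy - true| / max(true, floor); `floor` avoids division by zero
    for queries whose true answer is 0."""
    return abs(noisy_answer - true_answer) / max(true_answer, floor)


if __name__ == "__main__":
    original = [40, 25, 5, 0]
    released = [38.7, 26.9, 3.1, -0.8]
    print("KL divergence:", kl_divergence(original, released))
    print("Relative error:", relative_error(100, 130))
```

Under these definitions, a relative error of 0.3 means the released answer to a pattern query deviates from the true count by 30%.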
Conclusions: SHARE is one of the first systems to provide a mechanism for custodians to release differentially private aggregate statistics for a variety of use cases in the medical domain. This proof-of-concept system is intended to be applied to large-scale medical data warehouses.
