AMIA Annu Symp Proc. 2012;2012:1450-8.
Epub 2012 Nov 3.

Preserving Institutional Privacy in Distributed binary Logistic Regression

Yuan Wu et al. AMIA Annu Symp Proc. 2012.

Abstract

Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting the privacy of individual patients have been proposed, it is not clear how to protect institutional privacy, which is often a critical concern of data custodians. Building on our previous work, Grid Binary LOgistic REgression (GLORE) [1], we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy when building a logistic regression model in a distributed manner. We tested our method on both simulated and clinical data, showing that it is possible to protect the privacy of individuals and of institutions using a distributed strategy.
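
The abstract does not spell out how a logistic regression model is built "in a distributed manner". A minimal sketch of one plausible reading follows, assuming that each site sends only its gradient and Hessian contributions to a central server, which then takes a Newton-Raphson step. The function names (`local_summaries`, `solve`, `fit_distributed`) and the toy data are hypothetical and are not taken from the paper; the sketch also sends the per-site aggregates in the clear, so it illustrates the distributed fit only, not the institutional privacy protection.

```python
import math

def local_summaries(X, y, beta):
    """Run at one site: return gradient and Hessian contributions only."""
    d = len(beta)
    grad = [0.0] * d
    hess = [[0.0] * d for _ in range(d)]
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        w = p * (1.0 - p)
        for j in range(d):
            grad[j] += (yi - p) * xi[j]
            for k in range(d):
                hess[j][k] += w * xi[j] * xi[k]
    return grad, hess

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_distributed(sites, d, iters=25, tol=1e-10):
    """Server loop: sum per-site aggregates, then take a Newton-Raphson step."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for X, y in sites:  # raw rows never leave local_summaries
            g, h = local_summaries(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            hess = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(hess, h)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data for two hypothetical sites: columns are [intercept, feature].
site_a_X = [[1, 0.5], [1, 1.5], [1, 2.5], [1, 3.5]]
site_a_y = [0, 1, 0, 1]
site_b_X = [[1, 1.0], [1, 2.0], [1, 3.0], [1, 4.0]]
site_b_y = [0, 0, 1, 1]

beta = fit_distributed([(site_a_X, site_a_y), (site_b_X, site_b_y)], d=2)
```

Because the log-likelihood is a sum over patients, the summed per-site aggregates equal those of the pooled data, so the distributed fit should match a fit on all records held in one place up to floating-point rounding.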

Figures

Figure 1
Two types of network for distributed computing. The left panel shows the network used for the base GLORE model, in which only communication between the server and the clients is needed. The right panel shows the network used for Algorithm 1, in which communication between pairs of clients is also necessary.
Figure 2
The flow chart of Algorithm 2 for ROC curve computation. Suppose there are three local clients with local contingency tables A, B and C. Through secure summation (Algorithm 1), the central server obtains the overall table A+B+C.
Figure 3
ROC of a 2-site IPDLR for Edinburgh data
Figure 4
ROC of a 2-site IPDLR for CA-(19,125) data
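
The captions for Figures 1 and 2 describe a secure summation in which clients also talk to each other and the server learns only the overall table A+B+C. The page does not give the details of Algorithm 1, so the sketch below swaps in a common technique, pairwise additive masking, purely as an illustration. The table layout (one `[TP, FP, FN, TN]` row per threshold, flattened), the modulus, the counts, and all function names are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # assumed to exceed any true total count

def mask_tables(tables):
    """Return masked copies of each client's flat count table."""
    n = len(tables)
    size = len(tables[0])
    masked = [list(t) for t in tables]
    for i in range(n):
        for j in range(i + 1, n):
            # Clients i and j agree on this random mask over their direct link;
            # i adds it and j subtracts it, so it cancels in the total.
            r = [secrets.randbelow(MODULUS) for _ in range(size)]
            for k in range(size):
                masked[i][k] = (masked[i][k] + r[k]) % MODULUS
                masked[j][k] = (masked[j][k] - r[k]) % MODULUS
    return masked

def server_sum(masked):
    """Server side: add the masked tables; only the overall table is revealed."""
    return [sum(col) % MODULUS for col in zip(*masked)]

def roc_points(total, n_thresholds):
    """Turn summed [TP, FP, FN, TN] rows into (FPR, TPR) points."""
    points = []
    for t in range(n_thresholds):
        tp, fp, fn, tn = total[4 * t:4 * t + 4]
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

# Made-up local tables for clients A, B and C at two thresholds.
A = [3, 1, 1, 5, 2, 0, 2, 6]
B = [4, 2, 0, 4, 3, 1, 1, 5]
C = [2, 1, 2, 5, 1, 0, 3, 6]

masked = mask_tables([A, B, C])
total = server_sum(masked)      # equals A+B+C element-wise
curve = roc_points(total, 2)    # (FPR, TPR) per threshold
```

In this sketch each masked table looks random on its own, so the server recovers the pooled counts without seeing any single institution's contingency table.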

References

    1. Wu Y, Jiang X, Kim J, et al. Grid LOgistic REgression (GLORE): Building Shared Models Without Sharing Data. Journal of the American Medical Informatics Association (Accepted) 2012
    2. Wicks P, Vaughan TE, Massagli MP, et al. Accelerated clinical discovery using self-reported patient data collected online and a patient-matching algorithm. Nature biotechnology. 2011;29:411–414.
    3. McGraw D. Privacy and health information technology. The Journal of Law, Medicine & Ethics. 2009;37:121–149.
    4. Mohammed N, Fung BCM, Hung PCK, et al. Centralized and Distributed Anonymization for High-Dimensional Healthcare Data. ACM Transactions on Knowledge Discovery from Data. 2010;4(18):1–18. 33.
    5. Agrawal R, Grandison T, Johnson C, et al. Enabling the 21st century health care information technology revolution. Commun ACM. 2007;50:34–42.
