Grid Binary LOgistic REgression (GLORE): building shared models without sharing data
- PMID: 22511014
- PMCID: PMC3422844
- DOI: 10.1136/amiajnl-2012-000862
Abstract
Objective: The classification of complex or rare patterns in clinical and genomic data requires the availability of a large, labeled patient set. While methods that operate on large, centralized data sources have been extensively used, little attention has been paid to understanding whether models such as binary logistic regression (LR) can be developed in a distributed manner, allowing researchers to share models without necessarily sharing patient data.
Material and methods: Instead of bringing data to a central repository for computation, we bring computation to the data. The Grid Binary LOgistic REgression (GLORE) model integrates decomposable partial elements or non-privacy sensitive prediction values to obtain model coefficients, the variance-covariance matrix, the goodness-of-fit test statistic, and the area under the receiver operating characteristic (ROC) curve.
Results: We conducted experiments on both simulated and clinically relevant data, and compared the computational costs of GLORE with those of a traditional LR model estimated using the combined data. We showed that our results agree with those of LR to a precision of 10^-15. In addition, GLORE is computationally efficient.
Limitation: In GLORE, the calculation of coefficient gradients must be synchronized at different sites, which involves some effort to ensure the integrity of communication. Ensuring that the predictors have the same format and meaning across the data sets is necessary.
Conclusion: The results suggest that GLORE performs as well as LR and allows data to remain protected at their original sites.
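To make the "bring computation to the data" idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Newton-Raphson fit in which each site returns only its local gradient and information matrix, and a central node sums them to update the coefficients; the abstract does not spell out the exact update rule. The function names (`local_summaries`, `fit`, `make_site`), the three simulated sites, and the true coefficients are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_summaries(X, y, beta):
    """Run at one site: gradient and information matrix of the local
    log-likelihood. Only these aggregates leave the site, not patient rows."""
    d = len(beta)
    grad = [0.0] * d
    info = [[0.0] * d for _ in range(d)]
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        w = p * (1.0 - p)
        for j in range(d):
            grad[j] += (yi - p) * xi[j]
            for k in range(d):
                info[j][k] += w * xi[j] * xi[k]
    return grad, info

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit(sites, d, iters=25, tol=1e-12):
    """Central node: sum the per-site summaries, then take a Newton step."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        info = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = local_summaries(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            info = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(info, h)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def make_site(n):
    """Simulate one site's data (hypothetical coefficients -0.5, 1.0, -0.8)."""
    X, y = [], []
    for _ in range(n):
        x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
        p = sigmoid(-0.5 + 1.0 * x1 - 0.8 * x2)
        X.append([1.0, x1, x2])
        y.append(1 if random.random() < p else 0)
    return X, y

random.seed(0)
sites = [make_site(200), make_site(300), make_site(150)]
# Reference fit on the combined data, treated as a single "site".
pooled = [([r for X, _ in sites for r in X], [v for _, y in sites for v in y])]
beta_grid = fit(sites, 3)
beta_pooled = fit(pooled, 3)
max_diff = max(abs(a - b) for a, b in zip(beta_grid, beta_pooled))
print(beta_grid, max_diff)
```

Because the log-likelihood is a sum over patients, its gradient and information matrix decompose into per-site sums, so the distributed fit should differ from the pooled fit only by floating-point rounding. Under the same assumptions, inverting the summed information matrix at convergence would give the variance-covariance matrix mentioned in the abstract.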
Similar articles
- Preserving Institutional Privacy in Distributed binary Logistic Regression. AMIA Annu Symp Proc. 2012;2012:1450-8. Epub 2012 Nov 3. PMID: 23304425. Free PMC article.
- Secure Multi-pArty Computation Grid LOgistic REgression (SMAC-GLORE). BMC Med Inform Decis Mak. 2016 Jul 25;16 Suppl 3(Suppl 3):89. doi: 10.1186/s12911-016-0316-1. PMID: 27454168. Free PMC article.
- Grid multi-category response logistic models. BMC Med Inform Decis Mak. 2015 Feb 18;15:10. doi: 10.1186/s12911-015-0133-y. PMID: 25886151. Free PMC article.
- VERTIcal Grid lOgistic regression (VERTIGO). J Am Med Inform Assoc. 2016 May;23(3):570-9. doi: 10.1093/jamia/ocv146. Epub 2015 Nov 9. PMID: 26554428. Free PMC article.
- Binary Response Analysis Using Logistic Regression in Dentistry. Int J Dent. 2022 Mar 8;2022:5358602. doi: 10.1155/2022/5358602. eCollection 2022. PMID: 35310463. Free PMC article. Review.
Cited by
- Estimating individualized treatment effects using an individual participant data meta-analysis. BMC Med Res Methodol. 2024 Mar 25;24(1):74. doi: 10.1186/s12874-024-02202-9. PMID: 38528447. Free PMC article.
- Learning from vertically distributed data across multiple sites: An efficient privacy-preserving algorithm for Cox proportional hazards model with variable selection. J Biomed Inform. 2024 Jan;149:104581. doi: 10.1016/j.jbi.2023.104581. Epub 2023 Dec 23. PMID: 38142903.
- Individual Data Protected Integrative Regression Analysis of High-Dimensional Heterogeneous Data. J Am Stat Assoc. 2022;117(540):2105-2119. doi: 10.1080/01621459.2021.1904958. Epub 2021 May 19. PMID: 37975021. Free PMC article.
- A privacy-preserving and computation-efficient federated algorithm for generalized linear mixed models to analyze correlated electronic health records data. PLoS One. 2023 Jan 17;18(1):e0280192. doi: 10.1371/journal.pone.0280192. eCollection 2023. PMID: 36649349. Free PMC article.
- Federated Learning for Sparse Bayesian Models with Applications to Electronic Health Records and Genomics. Pac Symp Biocomput. 2023;28:484-495. PMID: 36541002. Free PMC article.
