Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform
- PMID: 30964441
- PMCID: PMC6477571
- DOI: 10.2196/13043
Abstract
Background: Health care data are increasing in volume and complexity. The demands of storing and analyzing these data to implement precision medicine initiatives and data-driven research have exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth.
Objective: The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform.
Methods: We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics.
Results: Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases.
Conclusions: The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research.
Keywords: big data; computational health care; data science; medical informatics computing; monitoring, physiologic.
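The data-acquisition step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' workflow: the message schema (`patient_id`, `signal`, `value`, `epoch_s`), the function names, and the `/datalake/monitoring` directory layout are all hypothetical, chosen only to show how a patient-monitoring message might be normalized and routed to a date-partitioned location in a data lake before downstream analytics.

```python
import json
from datetime import datetime, timezone


def normalize_reading(raw: str) -> dict:
    """Parse one JSON monitor message into a flat record.

    The input field names are assumptions, not the platform's real schema.
    """
    msg = json.loads(raw)
    observed = datetime.fromtimestamp(msg["epoch_s"], tz=timezone.utc)
    return {
        "patient_id": str(msg["patient_id"]),
        "signal": msg["signal"].lower(),
        "value": float(msg["value"]),
        "observed_at": observed.isoformat(),
    }


def partition_path(record: dict, root: str = "/datalake/monitoring") -> str:
    """Build a date-partitioned directory for a record (hypothetical layout)."""
    day = record["observed_at"][:10]  # the YYYY-MM-DD prefix of the ISO timestamp
    return f"{root}/signal={record['signal']}/date={day}"


if __name__ == "__main__":
    raw = '{"patient_id": 17, "signal": "HR", "value": "72", "epoch_s": 0}'
    record = normalize_reading(raw)
    print(record)
    print(partition_path(record))
```

In a deployment like the one described, logic of this kind would sit inside a Storm or NiFi processing step rather than a standalone script; partitioning by signal and date is one common way to keep large time-series datasets queryable.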
©Jacob McPadden, Thomas JS Durant, Dustin R Bunch, Andreas Coppi, Nathaniel Price, Kris Rodgerson, Charles J Torre Jr, William Byron, Allen L Hsiao, Harlan M Krumholz, Wade L Schulz. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.04.2019.
Conflict of interest statement
Conflicts of Interest: HMK was a recipient of a research grant, through Yale, from Medtronic and the US Food and Drug Administration to develop methods for postmarket surveillance of medical devices; is a recipient of research agreements with Medtronic and Johnson & Johnson (Janssen), through Yale, to develop methods of clinical trial data sharing; works under contract with the US Centers for Medicare & Medicaid Services to develop and maintain performance measures that are publicly reported; chairs a Cardiac Scientific Advisory Board for UnitedHealth Group Inc; is a participant and participant representative of the IBM Watson Health Life Sciences Board; is a member of the Advisory Board for Element Science, Inc, and the Physician Advisory Board for Aetna Inc; and is the founder of Hugo, a personal health information platform. WLS is a consultant for Hugo, a personal health information platform.
