Combining structured and unstructured data to identify a cohort of ICU patients who received dialysis
- PMID: 24384230
- PMCID: PMC4147606
- DOI: 10.1136/amiajnl-2013-001915
Abstract
Objective: To develop a generalizable method, based on simple information retrieval (IR) tools, for identifying patient cohorts from electronic health record (EHR) data, in this case patients having dialysis.
Methods: We used the coded data and clinical notes from the 24,506 adult patients in the Multiparameter Intelligent Monitoring in Intensive Care database to identify patients who had dialysis. We used SQL queries to search the procedure, diagnosis, and coded nursing observations tables based on ICD-9 and local codes. We used a domain-specific search engine to find clinical notes containing terms related to dialysis. We manually validated the available records for a 10% random sample of patients who potentially had dialysis and a random sample of 200 patients who were not identified as having dialysis based on any of the sources.
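The two retrieval steps described in Methods can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the table and column names (`procedures`, `notes`, `subject_id`, `icd9_code`, `text`), the toy rows, and the search terms are all hypothetical, and a plain regular expression stands in for the domain-specific search engine. ICD-9-CM procedure codes 39.95 (hemodialysis) and 54.98 (peritoneal dialysis) serve as examples only; the study's actual code list is not given in the abstract.

```python
import re
import sqlite3

# Hypothetical in-memory tables standing in for the coded and free-text data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE procedures (subject_id INTEGER, icd9_code TEXT);
    CREATE TABLE notes (subject_id INTEGER, text TEXT);
    INSERT INTO procedures VALUES (1, '39.95'), (2, '96.04'), (3, '54.98');
    INSERT INTO notes VALUES
        (1, 'Pt tolerated hemodialysis well.'),
        (2, 'Intubated for airway protection.'),
        (4, 'CVVHD started overnight.');
""")

# Example ICD-9-CM procedure codes (assumed, not the study's published list).
DIALYSIS_CODES = ("39.95", "54.98")

def coded_cohort(connection):
    """Return patients with at least one dialysis procedure code."""
    placeholders = ",".join("?" for _ in DIALYSIS_CODES)
    rows = connection.execute(
        "SELECT DISTINCT subject_id FROM procedures "
        f"WHERE icd9_code IN ({placeholders})",
        DIALYSIS_CODES,
    )
    return {row[0] for row in rows}

# Assumed search terms; a real query would need a clinically reviewed list.
DIALYSIS_TERMS = re.compile(r"\b(hemodialysis|dialysis|cvvhd)\b", re.IGNORECASE)

def notes_cohort(connection):
    """Return patients with at least one note mentioning a dialysis term."""
    rows = connection.execute("SELECT subject_id, text FROM notes")
    return {sid for sid, text in rows if DIALYSIS_TERMS.search(text)}

coded = coded_cohort(conn)
from_notes = notes_cohort(conn)
```

In this toy data, patient 3 is found only through a code and patient 4 only through a note, which is the situation that motivates searching both kinds of data.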
Results: We identified 1844 patients who potentially had dialysis: 1481 from the three coded sources and 1624 from the clinical notes. Precision for identifying dialysis patients based on available data was estimated to be 78.4% (95% CI 71.9% to 84.2%) and recall was 100% (95% CI 86% to 100%).
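Assuming the 1844 patients are the union of the coded-source group and the clinical-note group, as the Results sentence suggests, the overlap between the two groups follows from inclusion-exclusion. The short calculation below is a derived illustration under that assumption, not a set of figures reported in the abstract.

```python
total_any_source = 1844   # patients identified by at least one source
from_coded = 1481         # patients found in the three coded sources
from_notes = 1624         # patients found in the clinical notes

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
in_both = from_coded + from_notes - total_any_source
coded_only = from_coded - in_both
notes_only = from_notes - in_both

print(in_both, coded_only, notes_only)  # 1261 220 363
```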
Conclusions: Combining structured EHR data with information from clinical notes using simple queries increases the utility of both types of data for cohort identification. Patients identified by more than one source are more likely to meet the inclusion criteria; however, including patients found in any of the sources increases recall. This method is attractive because it is available to researchers with access to EHR data and off-the-shelf IR tools.
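The trade-off stated in the Conclusions can be illustrated with plain set operations. The patient identifiers and the chart-review "truth" set below are invented for illustration, and `precision_recall` is a hypothetical helper; none of the resulting values correspond to the study's data.

```python
# Hypothetical patient identifiers returned by each source.
coded_sources = {101, 102, 103, 104}
clinical_notes = {103, 104, 105, 106, 107}

# Hypothetical ground truth from manual chart review.
truly_dialysis = {102, 103, 104, 105}

def precision_recall(candidates, truth):
    """Return (precision, recall) of a candidate cohort against the truth set."""
    true_positives = len(candidates & truth)
    precision = true_positives / len(candidates) if candidates else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

any_source = coded_sources | clinical_notes    # union: favors recall
both_sources = coded_sources & clinical_notes  # intersection: favors precision

union_scores = precision_recall(any_source, truly_dialysis)
intersection_scores = precision_recall(both_sources, truly_dialysis)
print(union_scores)
print(intersection_scores)
```

In this toy example the union recovers every true dialysis patient at the cost of some false positives, while the intersection contains only true cases but misses half of them.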
Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
