Social Determinants of Health Documentation in Structured and Unstructured Clinical Data of Patients With Diabetes: Comparative Analysis
- PMID: 37621203
- PMCID: PMC10466443
- DOI: 10.2196/46159
Abstract
Background: Electronic health records (EHRs) have yet to fully capture social determinants of health (SDOH) due to challenges such as nonexistent or inconsistent data capture tools across clinics, lack of time, and the burden of extra steps for the clinician. However, patient clinical notes (unstructured data) may be a better source of patient-related SDOH information.
Objective: It is unclear how accurately EHR data reflect patients' lived experience of SDOH. The manual process of retrieving SDOH information from clinical notes is time-consuming and not feasible. We leveraged two high-throughput tools to identify SDOH mappings to structured and unstructured patient data: PatientExploreR and Electronic Medical Record Search Engine (EMERSE).
Methods: We included adult patients (≥18 years of age) receiving primary care for their diabetes at the University of California, San Francisco (UCSF), from January 1, 2018, to December 31, 2019. Expert raters developed a corpus, using the SDOH in the compendium as a knowledge base, to serve as targets for natural language processing (NLP) text string mapping that finds string stems, roots, and syntactic similarities in the clinical notes of patients with diabetes. We applied the advanced built-in EMERSE NLP query parsers, which are implemented with JavaCC.
Results: We included 4283 adult patients receiving primary care for diabetes at UCSF. Our study revealed that SDOH may be more significant in the lives of patients with diabetes than is evident from structured data recorded on EHRs. With the application of EMERSE NLP rules, we uncovered additional information from patient clinical notes on problems related to social connections/isolation, employment, financial insecurity, housing insecurity, food insecurity, education, and stress.
Conclusions: We discovered more patient information related to SDOH in unstructured data than in structured data. The application of this technique and further investment in similar user-friendly tools and infrastructure to extract SDOH information from unstructured data may help to identify the range of social conditions that influence patients' disease experiences and inform clinical decision-making.
Keywords: EHR; NLP; diabetes; diabetes mellitus; diabetic; electronic health record; free text; machine learning; medical informatics applications; natural language processing; search engine; social determinants of health; text string; unstructured data.
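The text string mapping described in the Methods can be illustrated with a minimal Python sketch. It is not EMERSE or its JavaCC query parser; the stem lists, domain names, and function names below are hypothetical stand-ins for the study's compendium, and the matching is plain case-insensitive prefix search with no negation handling.

```python
import re
from collections import defaultdict

# Hypothetical stems per SDOH domain (illustrative only; NOT the study's compendium).
SDOH_STEMS = {
    "housing insecurity": ["homeless", "evict", "unstable housing"],
    "food insecurity": ["food insecur", "food stamp", "food pantr"],
    "employment": ["unemploy", "laid off"],
    "social isolation": ["lives alone", "isolat", "lonel"],
}

# One compiled pattern per domain: a word boundary, any stem, then any word suffix.
PATTERNS = {
    domain: re.compile(
        r"\b(?:" + "|".join(re.escape(stem) for stem in stems) + r")\w*",
        re.IGNORECASE,
    )
    for domain, stems in SDOH_STEMS.items()
}


def domains_in_note(text):
    """Return the set of SDOH domains whose stems appear in one note."""
    return {domain for domain, pattern in PATTERNS.items() if pattern.search(text)}


def patients_by_domain(notes):
    """Map each SDOH domain to the set of patient IDs with a matching note.

    `notes` is an iterable of (patient_id, note_text) pairs.
    """
    result = defaultdict(set)
    for patient_id, text in notes:
        for domain in domains_in_note(text):
            result[domain].add(patient_id)
    return dict(result)


if __name__ == "__main__":
    example_notes = [
        ("p1", "Patient was evicted last month and lives alone."),
        ("p2", "Reports loneliness; recently unemployed."),
        ("p3", "No acute distress."),
    ]
    for domain, patients in sorted(patients_by_domain(example_notes).items()):
        print(domain, sorted(patients))
```

Comparing the per-domain patient sets from notes against those derived from structured fields would give the kind of structured versus unstructured contrast the study reports. A real pipeline would also need to handle negation ("denies homelessness") and templated text.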
© Shivani Mehta, Courtney R Lyles, Anna D Rubinsky, Kathryn E Kemper, Judith Auerbach, Urmimala Sarkar, Laura Gottlieb, William Brown III. Originally published in JMIR Medical Informatics (https://medinform.jmir.org).
Conflict of interest statement
None declared.
Similar articles
- Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing. J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7. PMID: 35007754
- Extracting social determinants of health from electronic health records using natural language processing: a systematic review. J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170. PMID: 34613399 Free PMC article.
- Leveraging natural language processing to augment structured social determinants of health data in the electronic health record. J Am Med Inform Assoc. 2023 Jul 19;30(8):1389-1397. doi: 10.1093/jamia/ocad073. PMID: 37130345 Free PMC article.
- Documentation and review of social determinants of health data in the EHR: measures and associated insights. J Am Med Inform Assoc. 2021 Nov 25;28(12):2608-2616. doi: 10.1093/jamia/ocab194. PMID: 34549294 Free PMC article. Review.
- Realizing the potential of social determinants data in EHR systems: A scoping review of approaches for screening, linkage, extraction, analysis, and interventions. J Clin Transl Sci. 2024 Oct 10;8(1):e147. doi: 10.1017/cts.2024.571. eCollection 2024. PMID: 39478779 Free PMC article. Review.
Cited by
- Large-scale identification of social and behavioral determinants of health from clinical notes: comparison of Latent Semantic Indexing and Generative Pretrained Transformer (GPT) models. BMC Med Inform Decis Mak. 2024 Oct 10;24(1):296. doi: 10.1186/s12911-024-02705-x. PMID: 39390479 Free PMC article.
- On the development and validation of large language model-based classifiers for identifying social determinants of health. Proc Natl Acad Sci U S A. 2024 Sep 24;121(39):e2320716121. doi: 10.1073/pnas.2320716121. Epub 2024 Sep 16. PMID: 39284061 Free PMC article.
