Estimating Patient's Health State Using Latent Structure Inferred from Clinical Time Series and Text

IEEE EMBS Int Conf Biomed Health Inform. 2017 Feb:2017:449-452. doi: 10.1109/BHI.2017.7897302. Epub 2017 Apr 13.

Abstract

Modern intensive care units (ICUs) collect large volumes of data while monitoring critically ill patients. ICU clinicians face the challenge of interpreting this high-dimensional data to diagnose and treat patients. In this work, we explore the use of Hierarchical Dirichlet Processes (HDP) as a Bayesian nonparametric framework to infer patients' states of health by combining multiple sources of data. In particular, we employ HDP to combine clinical time series and text from the nursing progress notes in a probabilistic topic modeling framework for patient risk stratification. Given a patient cohort, we use HDP to infer latent "topics" shared across multimodal patient data from the entire cohort. Each topic is modeled as a multinomial distribution over a vocabulary of codewords, defined over heterogeneous data sources. We evaluate the clinical utility of the learned topic structure using data from the first 24 hours in the ICU for over 17,000 adult patients in the MIMIC-II database to estimate patients' risks of in-hospital mortality. Our results demonstrate that our approach provides a viable framework for combining different data modalities to model patients' states of health, and can potentially be used to generate alerts to identify patients at high risk of in-hospital mortality.
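
The idea of a shared codeword vocabulary over heterogeneous data can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how codewords are constructed, so the bin edges, the `hr`/`sbp` signal names, the `txt_` prefix, and all function names below are assumptions chosen purely for illustration. The sketch turns one patient's time series and nursing note into a single bag of codewords, which is the kind of input a topic model expects.

```python
from collections import Counter

# Hypothetical bin edges for illustration only; not taken from the paper.
BIN_EDGES = {"hr": [60, 100], "sbp": [90, 140]}
LABELS = ["low", "normal", "high"]


def series_to_codewords(name, values):
    """Map each numeric sample to a discrete codeword such as 'hr_high'."""
    edges = BIN_EDGES[name]
    words = []
    for v in values:
        idx = sum(v >= e for e in edges)  # count of bin edges at or below v
        words.append(f"{name}_{LABELS[idx]}")
    return words


def text_to_codewords(note):
    """Lowercase, whitespace-tokenize, and prefix purely alphabetic tokens."""
    return [f"txt_{tok}" for tok in note.lower().split() if tok.isalpha()]


def patient_document(series, note):
    """Merge both modalities into one bag-of-codewords count for a patient."""
    words = []
    for name, values in series.items():
        words.extend(series_to_codewords(name, values))
    words.extend(text_to_codewords(note))
    return Counter(words)


doc = patient_document(
    {"hr": [55, 80, 120], "sbp": [85, 110]},
    "Patient sedated and intubated",
)
print(doc)
```

Per-patient counts of this form, collected over a cohort, could then be passed to an off-the-shelf HDP topic model (for instance `gensim.models.HdpModel`) to infer topics as distributions over the combined vocabulary; whether the paper used such a library is not stated in the abstract.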