Personalized mortality prediction driven by electronic medical data and a patient similarity metric

PLoS One. 2015 May 15;10(5):e0127428. doi: 10.1371/journal.pone.0127428. eCollection 2015.

Abstract

Background: Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made.

Methods and findings: We applied a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify the patients who are most similar to each index patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care.
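The general idea of ranking past patients by cosine similarity and predicting from only the closest ones can be illustrated with a minimal sketch. This is not the study's implementation: the function names, the two-feature toy vectors, and the use of the death rate among the selected patients as the predictor are all assumptions made for illustration, and the abstract does not specify which model types or feature encodings were used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index_patient, past_patients, k):
    """Return the indices of the k past patients most similar to the index patient."""
    if not 1 <= k <= len(past_patients):
        raise ValueError("k must be between 1 and the number of past patients")
    ranked = sorted(
        range(len(past_patients)),
        key=lambda i: cosine_similarity(index_patient, past_patients[i]),
        reverse=True,
    )
    return ranked[:k]


def predict_mortality(index_patient, past_patients, died, k):
    """Estimate risk as the death rate among the k most similar past patients.

    Hypothetical stand-in for a custom-built model: any classifier could be
    trained on the selected subset instead.
    """
    neighbours = most_similar(index_patient, past_patients, k)
    return sum(died[i] for i in neighbours) / len(neighbours)


# Toy data (invented): four past patients, two features each, 1 = died.
past = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
died = [1, 1, 0, 0]
index_patient = [1.0, 0.05]

risk_similar = predict_mortality(index_patient, past, died, k=2)
risk_all = predict_mortality(index_patient, past, died, k=len(past))
print(risk_similar, risk_all)
```

In this toy setting, restricting training to the two most similar patients gives a different estimate than pooling all four, which mirrors the trade-off described above: a smaller k yields more similar but fewer training patients. With real data the features would typically need to be scaled to comparable ranges before computing similarities.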

Conclusions: The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR) systems, our novel medical data analytics contributes to meaningful use of EMR data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Critical Care / statistics & numerical data
  • Databases, Factual / statistics & numerical data*
  • Electronic Health Records / statistics & numerical data*
  • Female
  • Hospital Mortality / trends*
  • Humans
  • Intensive Care Units / statistics & numerical data*
  • Male
  • Middle Aged
  • Models, Statistical
  • Severity of Illness Index

Grants and funding

JL (RGPIN-2014-04743) and JD (RGPIN-2014-05911) were partially supported by Discovery Grants from the Natural Sciences and Engineering Research Council of Canada (http://www.nserc-crsng.gc.ca/index_eng.asp). JL and JD were otherwise supported by the University of Waterloo as faculty members. DM was fully supported by Queen’s University as a faculty member. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.