Assessing clinical heterogeneity in sepsis through treatment patterns and machine learning

J Am Med Inform Assoc. 2019 Dec 1;26(12):1466-1477. doi: 10.1093/jamia/ocz106.

Abstract

Objective: To use unsupervised topic modeling to evaluate heterogeneity in sepsis treatment patterns contained within granular data of electronic health records.

Materials and methods: We conducted a multicenter, retrospective cohort study of 29 253 adult patients hospitalized with sepsis between 2010 and 2013 in Northern California. We applied an unsupervised machine learning method, Latent Dirichlet Allocation, to the orders, medications, and procedures recorded in the electronic health record within the first 24 hours of each patient's hospitalization. The goals were to uncover empiric treatment topics across the cohort and to develop a computable clinical signature for each patient based on the proportions of these topics. We evaluated how these topics correlated with common sepsis treatment and outcome metrics, including inpatient mortality, time to first antibiotic, and fluids given within 24 hours.
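As an illustration of the method described above, the following is a minimal sketch of Latent Dirichlet Allocation in which each hospitalization is treated as a "document" and each order, medication, or procedure from the first 24 hours as a "word". It is not the study's implementation: the abstract does not say how the model was fit, so collapsed Gibbs sampling is an assumption here, as are the event names, the hyperparameters `alpha` and `beta`, and the use of 2 topics rather than the 42 reported.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed fitting method).

    docs: list of lists of event names, one inner list per hospitalization.
    Returns one topic-proportion vector ("clinical signature") per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # events per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # events per (topic, word)
    topic_total = [0] * n_topics                           # events per topic
    assignments = []

    # Start from a random topic assignment for every event.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each event's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document; each vector sums to 1.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical first-24-hour events for four hospitalizations.
patients = [
    ["chest_xray", "ceftriaxone", "azithromycin", "sputum_culture"],
    ["chest_xray", "azithromycin", "oxygen_therapy", "ceftriaxone"],
    ["wound_culture", "vancomycin", "dressing_change"],
    ["vancomycin", "dressing_change", "wound_culture", "chest_xray"],
]
signatures = fit_lda(patients, n_topics=2)
```

Each entry of `signatures` is one patient's topic-proportion vector, which is the kind of quantity the abstract calls a clinical signature. On data this small the fitted topics are not meaningful; the sketch only shows the shape of the computation.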

Results: Mean age was 70 ± 17 years, and hospital mortality was 9.6%. We empirically identified 42 clinically recognizable treatment topics (eg, pneumonia, cellulitis, wound care, shock). Only 43.1% of hospitalizations had a single dominant topic, and a small minority (7.3%) had a single topic comprising at least 80% of their overall clinical signature. Across the entire sepsis cohort, clinical signatures were highly variable.
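The two percentages above are shares of hospitalizations whose largest topic proportion reaches some cutoff. A minimal sketch of that arithmetic follows. The 80% cutoff comes from the abstract; the abstract does not define "dominant", so the 0.5 cutoff used for it below is purely an assumption, and the signatures are made-up values.

```python
def share_with_topic_at_least(signatures, threshold):
    """Fraction of signatures whose largest topic proportion is >= threshold."""
    if not signatures:
        return 0.0
    hits = sum(1 for s in signatures if max(s) >= threshold)
    return hits / len(signatures)


# Hypothetical clinical signatures (topic proportions) for four hospitalizations.
toy_signatures = [
    [0.85, 0.10, 0.05],
    [0.55, 0.30, 0.15],
    [0.40, 0.35, 0.25],
    [0.34, 0.33, 0.33],
]

# Assumed cutoff of 0.5 for a "dominant" topic (not defined in the abstract).
dominant_share = share_with_topic_at_least(toy_signatures, 0.5)    # 0.5
# Cutoff of 0.8 as stated in the abstract.
near_pure_share = share_with_topic_at_least(toy_signatures, 0.8)   # 0.25
```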

Discussion: Heterogeneity in sepsis is a major barrier to improving targeted treatments, yet existing approaches to characterizing clinical heterogeneity are narrowly defined. A machine learning approach captured substantial patient- and population-level heterogeneity in treatment during early sepsis hospitalization.

Conclusion: Using topic modeling based on treatment patterns may enable more precise clinical characterization in sepsis and better understanding of variability in sepsis presentation and outcomes.

Keywords: infection; latent Dirichlet allocation; machine learning; topic modeling; treatment heterogeneity.

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Anti-Bacterial Agents / therapeutic use
  • Electronic Health Records*
  • Female
  • Hospital Mortality
  • Hospitalization
  • Humans
  • Male
  • Middle Aged
  • Patient Acuity
  • Quality of Health Care
  • Retrospective Studies
  • Sepsis / complications
  • Sepsis / mortality
  • Sepsis / therapy*
  • Unsupervised Machine Learning*

Substances

  • Anti-Bacterial Agents