Two data-driven approaches to identifying the spectrum of problematic opioid use: A pilot study within a chronic pain cohort

Int J Med Inform. 2021 Dec;156:104621. doi: 10.1016/j.ijmedinf.2021.104621. Epub 2021 Oct 15.


Background: Although electronic health records (EHRs) have significant potential for the study of opioid use disorders (OUD), detecting OUD in clinical data is challenging. Models using EHR data to predict OUD often rely on case/control classifications focused on extreme opioid use. There is a need to expand this work to characterize the spectrum of problematic opioid use.

Methods: Using a large academic medical center database, we developed 2 data-driven methods of OUD detection: (1) a Comorbidity Score developed from a Phenome-Wide Association Study of phenotypes associated with OUD and (2) a Text-based Score using natural language processing to identify OUD-related concepts in clinical notes. We evaluated the performance of both scores against a manual review using correlation coefficients, Wilcoxon rank sum tests, and areas under the receiver operating characteristic curve (AUCs). Records with the highest Comorbidity and Text-based scores were re-evaluated by manual review to explore discrepancies.
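
As an illustration of how a risk score can be compared against binary manual-review labels, the following is a minimal sketch, not the authors' code. It uses the standard relationship between the rank sum statistic and the AUC: with U computed from the ranks of the positive group, AUC = U / (n_pos * n_neg). The function names, the scores, and the labels are hypothetical and made up for illustration; significance testing is omitted.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_auc(scores, labels):
    """Return (U, AUC) for label-1 records versus label-0 records."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U derived from the rank sum of the positive group.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u, u / (n_pos * n_neg)


# Hypothetical data: 1 = evidence for OUD on manual review, 0 = no evidence.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [0, 0, 1, 1, 1, 0]
u_stat, auc = rank_sum_auc(scores, labels)
print(f"U = {u_stat}, AUC = {auc:.3f}")
```

The AUC here is the fraction of (positive, negative) record pairs in which the positive record has the higher score, with ties counted as one half.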

Results: Both the Comorbidity and Text-based OUD risk scores were significantly elevated in the patients judged as High Evidence for OUD in the manual review compared to those with No Evidence (p = 1.3E-5 and 1.3E-6, respectively). The risk scores were positively correlated with each other (rho = 0.52, p < 0.001). AUCs for the Comorbidity and Text-based scores were high (0.79 and 0.76, respectively). Follow-up manual review of discrepant findings revealed strengths of data-driven methods over manual review and opportunities for improvement in risk assessment.
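
The rho reported above is a rank correlation between the two scores. As a hedged sketch of how such a coefficient can be computed (assuming Spearman's rho, calculated as the Pearson correlation of the ranks), the example below uses hypothetical function names and made-up score values that are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-patient scores (illustrative values only).
comorbidity_scores = [0.2, 1.5, 0.9, 3.1, 2.4]
text_scores = [0.0, 2.0, 0.5, 1.0, 4.0]
rho = spearman_rho(comorbidity_scores, text_scores)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, the two scores do not need to share a scale to be compared this way.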

Conclusion: Risk scores comprising comorbidities and text offer differing but synergistic insights into characterizing problematic opioid use. This pilot project establishes a foundation for more robust work in the future.

Keywords: Chronic pain; Electronic health records; Natural language processing; Opioid use disorder; Phenome-wide association study.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Analgesics, Opioid / therapeutic use
  • Chronic Pain* / drug therapy
  • Chronic Pain* / epidemiology
  • Humans
  • Natural Language Processing
  • Opioid-Related Disorders* / epidemiology
  • Pilot Projects


Substances

  • Analgesics, Opioid