External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients

Addiction. 2022 Apr;117(4):925-933. doi: 10.1111/add.15730. Epub 2021 Nov 23.


Background and aims: Unhealthy alcohol use (UAU) is one of the leading causes of global morbidity. A machine learning approach to alcohol screening could accelerate best practices when integrated into electronic health record (EHR) systems. This study aimed to validate externally a natural language processing (NLP) classifier developed at an independent medical center.

Design: Retrospective cohort study.

Setting: The site for validation was a midwestern United States tertiary-care, urban medical center that has an inpatient structured universal screening model for unhealthy substance use and an active addiction consult service.

Participants/cases: Unplanned admissions of adult patients between October 23, 2017 and December 31, 2019, with EHR documentation of manual alcohol screening were included in the cohort (n = 57 605).

Measurements: The Alcohol Use Disorders Identification Test (AUDIT) served as the reference standard. AUDIT scores ≥5 for females and ≥8 for males served as cases for UAU. To examine error in manual screening or under-reporting, a post hoc error analysis was conducted, reviewing discordance between the NLP classifier and AUDIT-derived reference. All clinical notes excluding the manual screening and AUDIT documentation from the EHR were included in the NLP analysis.
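The sex-specific AUDIT thresholds above can be sketched as a small labeling rule. This is a minimal illustration, not the study's code: the function name `is_uau_case`, the `sex` string values, and the error handling for other values are all assumptions made for the example.

```python
# Hypothetical helper illustrating the AUDIT-derived reference label.
# Thresholds come from the abstract: >=5 for females, >=8 for males.
AUDIT_THRESHOLDS = {"female": 5, "male": 8}


def is_uau_case(audit_score: int, sex: str) -> bool:
    """Return True if the AUDIT score meets the sex-specific UAU threshold."""
    key = sex.strip().lower()
    if key not in AUDIT_THRESHOLDS:
        # The abstract only defines thresholds for females and males.
        raise ValueError(f"No AUDIT threshold defined for sex={sex!r}")
    return audit_score >= AUDIT_THRESHOLDS[key]
```

Under this rule, a score of 6 would count as a case for a female patient but not for a male patient.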

Findings: Using clinical notes from the first 24 hours of each encounter, the NLP classifier demonstrated an area under the receiver operating characteristic curve (AUCROC) and precision-recall area under the curve (PRAUC) of 0.91 (95% CI = 0.89-0.92) and 0.56 (95% CI = 0.53-0.60), respectively. At the optimal cut point of 0.5, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were 0.66 (95% CI = 0.62-0.69), 0.98 (95% CI = 0.98-0.98), 0.35 (95% CI = 0.33-0.38), and 1.0 (95% CI = 1.0-1.0), respectively.
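For readers less familiar with these metrics, the following sketch shows how sensitivity, specificity, PPV, and NPV are derived from predicted probabilities at a fixed cut point. It is an illustration under stated assumptions: the function name `screening_metrics`, the choice to treat scores at or above the cut point as positive, and the toy labels and scores are invented for the example and are not study data.

```python
from typing import Dict, Sequence


def screening_metrics(
    y_true: Sequence[int], y_score: Sequence[float], cut_point: float = 0.5
) -> Dict[str, float]:
    """Compute screening metrics from binary labels and classifier scores."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted_positive = score >= cut_point  # assumption: ties count as positive
        if predicted_positive and truth:
            tp += 1
        elif predicted_positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among all cases
        "specificity": ratio(tn, tn + fp),  # true negatives among all non-cases
        "ppv": ratio(tp, tp + fp),          # cases among positive predictions
        "npv": ratio(tn, tn + fn),          # non-cases among negative predictions
    }


# Toy data only (not from the study): 3 cases, 5 non-cases.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.05]
metrics = screening_metrics(labels, scores)
```

Sensitivity and specificity are each computed within the case or non-case group, whereas PPV and NPV also depend on how common cases are in the cohort. A classifier with high specificity can therefore still show a modest PPV when cases are rare.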

Conclusions: External validation of a publicly available alcohol misuse classifier demonstrates adequate sensitivity and specificity for routine clinical use as an automated screening tool for identifying at-risk patients.

Keywords: Addiction consultation service; data science; inpatient screening; machine learning; natural language processing; unhealthy alcohol use.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Alcohol Drinking
  • Alcoholism* / diagnosis
  • Ethanol
  • Female
  • Humans
  • Machine Learning
  • Male
  • Natural Language Processing
  • Retrospective Studies

