Using machine learning to predict opioid misuse among U.S. adolescents

Prev Med. 2020 Jan:130:105886. doi: 10.1016/j.ypmed.2019.105886. Epub 2019 Nov 6.

Abstract

This study evaluated the performance of three machine learning (ML) techniques in predicting opioid misuse among U.S. adolescents. Data were drawn from the 2015-2017 National Survey on Drug Use and Health (N = 41,579 adolescents, ages 12-17 years) and analyzed in 2019. Prediction models were developed using three ML algorithms: artificial neural networks, distributed random forest, and gradient boosting machine. The performance of the ML models was compared with that of penalized logistic regression. The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) were used as metrics of prediction performance. We used the AUPRC as the primary measure because it is considered more informative than the AUROC for assessing binary classifiers on an imbalanced outcome variable. The overall rate of opioid misuse among U.S. adolescents was 3.7% (n = 1521). Prediction performance measured by AUROC was similar across the four models (values ranged from 0.809 to 0.815). In terms of the AUPRC, the distributed random forest showed the best performance (0.172), followed by penalized logistic regression (0.162), gradient boosting machine (0.160), and artificial neural networks (0.157). Findings suggest that machine learning techniques can be promising tools, especially for predicting rare outcomes (i.e., when the binary outcome variable is heavily imbalanced) such as adolescent opioid misuse.
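The two metrics named above can be illustrated with a minimal sketch in plain Python. It runs on synthetic scores, not the survey data, and the function names (auroc, average_precision) are illustrative. Average precision is used here as one common estimator of the AUPRC; the software used in the study may compute the area differently. The sketch also shows why the AUPRC is read against the outcome rate: a classifier that ranks at random has an expected AUPRC close to the prevalence (about 0.037 for a 3.7% outcome), so the reported AUPRC values are best compared with that baseline rather than with 0.5.

```python
import random


def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney) formulation; tied scores share an average rank."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive.

    Assumes scores are effectively untied (true for the continuous scores below).
    """
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    n_pos = sum(y_true)
    true_positives = 0
    total = 0.0
    for rank, k in enumerate(order, start=1):
        if y_true[k] == 1:
            true_positives += 1
            total += true_positives / rank
    return total / n_pos


# Synthetic illustration only (assumed values, not study data):
# a rare outcome at roughly 3.7% prevalence with moderately separated scores.
rng = random.Random(0)
labels = [1 if rng.random() < 0.037 else 0 for _ in range(20000)]
scores = [rng.gauss(1.2, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

prevalence = sum(labels) / len(labels)
model_auroc = auroc(labels, scores)
model_ap = average_precision(labels, scores)

# A random ranking for comparison: AUROC near 0.5, average precision near prevalence.
random_scores = [rng.random() for _ in labels]
random_auroc = auroc(labels, random_scores)
random_ap = average_precision(labels, random_scores)

print(f"prevalence          : {prevalence:.3f}")
print(f"model  AUROC / AP   : {model_auroc:.3f} / {model_ap:.3f}")
print(f"random AUROC / AP   : {random_auroc:.3f} / {random_ap:.3f}")
```

On imbalanced data such as this, a high AUROC can coexist with a much lower average precision, because precision is pulled down by the large number of negatives. That pattern is consistent with the gap between the AUROC and AUPRC values reported in the abstract.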

Keywords: Distributed random forest; Machine learning; Opioid misuse; Penalized logistic regression; Substance use.

Publication types

  • Comparative Study

MeSH terms

  • Adolescent
  • Adolescent Behavior
  • Algorithms
  • Area Under Curve
  • Child
  • Female
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Opioid-Related Disorders / epidemiology*
  • Risk Assessment / methods
  • Surveys and Questionnaires
  • United States / epidemiology