Background: This longitudinal study explored the utility of machine learning (ML) methodology in predicting the trajectory of severity of substance use from childhood to thirty years of age using a set of psychological and health characteristics.
Design: Boys (N = 494) and girls (N = 206) were recruited using a high-risk paradigm at 10-12 years of age and followed up at 12-14, 16, 19, 22, 25 and 30 years of age.
Measurements: At each visit, the subjects were administered a comprehensive battery to measure psychological makeup, health status, substance use and psychiatric disorder, and the overall harmfulness of their substance consumption was quantified according to the multidimensional criteria (physical, dependence, and social) developed by Nutt et al. (2007). Next, high- and low-severity substance use trajectories were derived; these were differentially associated with the probability of segueing to substance use disorder (SUD). ML methodology was employed to predict trajectory membership.
Findings: The high-severity trajectory group had a higher probability of progressing to SUD than the low-severity trajectory group (89.0% vs 32.4%; odds ratio = 16.88, p < 0.0001). Thirty psychological and health status items at each of the six visits predicted membership in the high- or low-severity trajectory, with 71% accuracy at 10-12 years of age, increasing to 93% at 22 years of age.
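The reported odds ratio can be checked against the two reported percentages. The following is a minimal sketch, assuming that 89.0% and 32.4% are the group-level proportions progressing to SUD; the function and variable names are illustrative and do not come from the study, and the published estimate may have been computed from raw counts rather than from these rounded values.

```python
def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_high: float, p_low: float) -> float:
    """Ratio of the odds in the high-severity group to the odds in the low-severity group."""
    return odds(p_high) / odds(p_low)


# Proportions progressing to SUD, as reported in the Findings (rounded values).
p_high = 0.890  # high-severity trajectory group
p_low = 0.324   # low-severity trajectory group

or_value = odds_ratio(p_high, p_low)
print(f"odds ratio = {or_value:.2f}")
```

Under this assumption the rounded percentages reproduce the reported odds ratio of 16.88 to two decimal places.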
Conclusion: These findings demonstrate the applicability of the machine learning methodology for detecting membership in a substance use trajectory with high probability of culminating in SUD, potentially informing primary and secondary prevention.
Keywords: Machine learning; Random Forest; Substance misuse prevention; Substance use disorder; Trajectory analysis.
Copyright © 2019 Elsevier B.V. All rights reserved.