Introduction: Machine learning algorithms such as elastic net regression and backward selection offer a powerful approach to building models from a set of psychosocial predictors of smoking lapse measured repeatedly via ecological momentary assessment (EMA). Understanding these predictors may aid in developing interventions to prevent smoking lapse.
Methods: In a randomized controlled smoking cessation trial, smartphone-based EMAs were collected from 92 participants following a scheduled quit date. This secondary analysis used elastic net-penalized Cox proportional hazards regression and model approximation via backward elimination to (1) optimize a predictive model of time to first lapse and (2) simplify that model to its core constituent predictors to maximize parsimony and generalizability.
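For reference, one common way to write an elastic net-penalized Cox objective is sketched below. The notation and the scaling convention are assumptions introduced here for illustration; the exact parameterization used in the analysis is not stated in this abstract.

```latex
\hat{\beta} = \arg\max_{\beta}
\left\{
  \frac{2}{n} \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R_i} \exp\!\left( x_j^{\top}\beta \right) \right]
  - \lambda \left[ \alpha \lVert \beta \rVert_1 + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right]
\right\}
```

Here $\delta_i = 1$ marks an observed first lapse, $R_i$ is the set of participants still at risk at that lapse time, $x_i$ is the predictor vector, $\lambda \ge 0$ sets the overall penalty strength, and $\alpha \in [0, 1]$ mixes the lasso ($\alpha = 1$) and ridge ($\alpha = 0$) penalties. Predictors whose coefficients remain nonzero at the chosen $\lambda$ are the ones the model "selects".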
Results: Elastic net proportional hazards regression selected 17 of 26 possible predictors from 2065 EMAs to model time to first lapse. The predictors with the largest-magnitude regression coefficients were having consumed alcohol in the past hour, being around and interacting with a smoker, and having cigarettes easily available. This model was reduced using backward elimination, retaining five predictors and approximating 93.9% of the full model's fit. The retained predictors included those mentioned above as well as feeling irritable and being in areas where smoking is either discouraged or allowed (as opposed to not permitted).
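The reduction step can be illustrated with a minimal sketch of model approximation by backward elimination. It assumes one common approach: regress the full model's linear predictor on subsets of the predictors with ordinary least squares, and drop predictors while the subset still reproduces that linear predictor well. The function names, the 0.90 threshold, and the synthetic data are all hypothetical, and the study's actual procedure and stopping rule are not given in this abstract.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit (with intercept) of y on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def approximate_model(X, full_lp, names, min_r2=0.90):
    """Drop predictors one at a time while the reduced OLS model still
    reproduces the full model's linear predictor with R^2 >= min_r2."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        # R^2 that would remain after dropping each predictor in turn
        trials = [
            (r_squared(X[:, [k for k in keep if k != j]], full_lp), j)
            for j in keep
        ]
        best_r2, drop = max(trials)  # the least harmful removal
        if best_r2 < min_r2:
            break  # any further removal loses too much fit
        keep.remove(drop)
    return [names[k] for k in keep], r_squared(X[:, keep], full_lp)

# Synthetic stand-in for a fitted full model (hypothetical values only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
beta = np.array([1.0, 0.8, 0.6, 0.05, 0.03, 0.0])
full_lp = X @ beta  # linear predictor of the "full" model
names = ["x1", "x2", "x3", "x4", "x5", "x6"]

retained, r2 = approximate_model(X, full_lp, names)
print(retained, round(r2, 3))
```

In this sketch the R^2 of the reduced model against the full linear predictor plays the role of the "proportion of model fit" retained; the threshold is a tuning choice, not a value taken from the study.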
Conclusions: The strongest predictors of smoking lapse were environmental in nature (e.g., being in smoking-permitted areas) as opposed to internal factors such as psychological affect. Interventions may be improved by a renewed focus on these environmental predictors.
Implications: The present study demonstrated the utility of machine learning algorithms for optimizing the prediction of time to smoking lapse using EMA data. Both models generated by the present analysis indicated that environmental factors were most strongly related to smoking lapse. The results support the use of machine learning algorithms to investigate intensive longitudinal data and provide a foundation for the development of highly tailored, just-in-time interventions that can target multiple antecedents of smoking lapse.