Utilization of Machine Learning for the Differentiation of Positional NPS Isomers with Direct Analysis in Real Time Mass Spectrometry

Jennifer L Bonetti; Saer Samanipour; Arian C van Asten

doi:10.1021/acs.analchem.1c04985

Anal Chem. 2022 Mar 29;94(12):5029-5040. doi: 10.1021/acs.analchem.1c04985. Epub 2022 Mar 17.

Authors

Jennifer L Bonetti¹,², Saer Samanipour¹, Arian C van Asten¹,³

Affiliations

¹ Van't Hoff Institute for Molecular Sciences, University of Amsterdam, P.O. Box 94157, Amsterdam 1090 GD, The Netherlands.
² Virginia Department of Forensic Science, Norfolk, Virginia 23606, United States.
³ Co van Ledden Hulsebosch Center (CLHC), Amsterdam Center for Forensic Science and Medicine, 1098 XH Amsterdam, The Netherlands.

Abstract

The differentiation of positional isomers is a well-established analytical challenge for forensic laboratories. As more novel psychoactive substances (NPSs) are introduced to the illicit drug market, robust yet efficient methods of isomer identification are needed. Although current literature suggests that Direct Analysis in Real Time-Time-of-Flight mass spectrometry (DART-ToF) with in-source collision-induced dissociation (is-CID) can be used to differentiate positional isomers, it is currently unclear whether this capability extends to positional isomers whose only structural difference is the precise location of a single substitution on an aromatic ring. The aim of this work was to determine whether chemometric analysis of DART-ToF data could offer forensic laboratories an alternative rapid and robust method of differentiating NPS positional ring isomers. To test the feasibility of this technique, three positional isomer sets (fluoroamphetamine, fluoromethamphetamine, and methylmethcathinone) were analyzed. Using a linear rail for consistent sample introduction, the three isomers of each type were analyzed 96 times over an eight-week timespan. The classification methods investigated included a univariate approach, the Welch t test at each included ion; a multivariate approach, linear discriminant analysis; and a machine learning approach, the Random Forest classifier. For each method, multiple validation techniques were used, including restricting the classifier to data generated on only a single day. Of these classification methods, the Random Forest algorithm was ultimately the most accurate and robust, consistently achieving out-of-bag error rates below 5%. At an inconclusive rate of approximately 5%, a success rate of 100% was obtained for isomer identification when applied to a randomly selected test set. The model was further tested with data acquired as part of a different batch. The highest classification success rate was 93.9%, and error rates under 5% were consistently achieved.
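
As an illustration of two ideas named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It shows (a) the Welch t statistic, with Welch-Satterthwaite degrees of freedom, for the relative abundance of one ion measured for two isomers, and (b) one possible way to report an "inconclusive" result when a classifier's top vote fraction is too low. The function names, the example abundances, and the 0.6 threshold are assumptions chosen for illustration; the abstract does not specify how the inconclusive rate was set.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on samples a and b."""
    na, nb = len(a), len(b)
    # Squared standard errors of each sample mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


def call_isomer(vote_fractions, threshold=0.6):
    """Return the top-voted label, or 'inconclusive' below the threshold.

    vote_fractions maps each isomer label to the fraction of trees voting
    for it; the threshold value is a hypothetical choice.
    """
    label, fraction = max(vote_fractions.items(), key=lambda kv: kv[1])
    return label if fraction >= threshold else "inconclusive"


if __name__ == "__main__":
    # Hypothetical relative abundances of one fragment ion for two isomers.
    isomer_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    isomer_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(welch_t(isomer_a, isomer_b))
    print(call_isomer({"2-FA": 0.40, "3-FA": 0.35, "4-FA": 0.25}))
```

Raising the threshold trades a higher inconclusive rate for fewer misclassifications among the calls that are reported, which is the kind of trade-off the abstract describes.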

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Isomerism
Machine Learning*
Mass Spectrometry / methods