A machine learning approach for handling big data produced by high resolution mass spectrometry after data independent acquisition of small molecules - Proof of concept study using an artificial neural network for sample classification

Drug Test Anal. 2020 Jun;12(6):836-845. doi: 10.1002/dta.2775. Epub 2020 Feb 6.

Abstract

Liquid chromatography coupled to high-resolution mass spectrometry (HRMS) enables data independent acquisition (DIA) and untargeted screening. However, to avoid handling the resulting large datasets, most laboratories in the field still use targeted screening methods, which offer good sensitivity and specificity but are limited to known compounds. Machine learning offers new possibilities, such as artificial neural networks that can be trained to classify large amounts of data. In this proof of concept study, we exemplify such a machine learning approach for raw HRMS-DIA data files. We evaluated a machine learning model using training, validation, and test sets of solvent and whole blood samples containing drugs (of abuse) common in forensic toxicology. For that purpose, different platforms were used. A feedforward neural network architecture was used to predict the sample category (blank sample vs. drug-containing sample). With the applied machine learning approaches, the sensitivity and specificity of the sample class predictions for the validation and test sets were in a range suitable for actual use in a (routine) laboratory (e.g. workplace drug testing). In conclusion, this proof of concept study demonstrated the potential of machine learning for the analysis of HRMS-DIA data.
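
As an illustration only, the two ideas named in the abstract (a feedforward network that outputs a blank vs. drug-containing decision, and the sensitivity and specificity used to judge it) can be sketched in a few lines of Python. This is not the study's model or code: the layer sizes, weights, feature values, 0.5 decision threshold, and the function names `feedforward_predict` and `sensitivity_specificity` are all assumptions made for the example, and the abstract does not state how the raw HRMS-DIA files were turned into input features.

```python
import math


def feedforward_predict(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a tiny feedforward network (hypothetical layout).

    One ReLU hidden layer followed by a sigmoid output unit. The return
    value is interpreted as the probability of "drug-containing sample".
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Label 1 = drug-containing sample, label 0 = blank sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up weights and a made-up two-value feature vector, for illustration.
probability = feedforward_predict(
    features=[1.0, 2.0],
    w_hidden=[[1.0, -1.0], [0.5, 0.5]],
    b_hidden=[0.0, 0.0],
    w_out=[1.0, 2.0],
    b_out=-3.0,
)
predicted_class = 1 if probability >= 0.5 else 0  # assumed threshold

# Made-up labels: three drug-containing samples and two blanks.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Sensitivity here is the fraction of drug-containing samples that are flagged, and specificity is the fraction of blank samples that are correctly left unflagged. In a screening setting the two are usually reported together, because a classifier can trivially maximize one at the expense of the other.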

Keywords: SWATH; data independent acquisition; high resolution mass spectrometry; machine learning; small molecules.

MeSH terms

  • Big Data*
  • Chromatography, Liquid
  • Cocaine / blood
  • Humans
  • Machine Learning*
  • Mass Spectrometry / statistics & numerical data*
  • Neural Networks, Computer*
  • Proof of Concept Study
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Small Molecule Libraries
  • Substance Abuse Detection / methods
  • Zolpidem / blood

Substances

  • Small Molecule Libraries
  • Zolpidem
  • Cocaine