Prediction of mass spectrometry ionization efficiency based on COSMO-RS and machine learning algorithms

Analyst. 2024 Apr 17. doi: 10.1039/d4an00301b. Online ahead of print.

Abstract

Non-targeted analysis by high-resolution mass spectrometry (MS) can identify thousands of compounds, which makes their quantification a major challenge. The aim of this study is to investigate how the MS ionization efficiency (IE) of various compounds in food varies with solvent ratio, and to develop a predictive model of ionization efficiency that enables non-targeted quantitative prediction of unknown compounds. The study covered 70 compounds in 14 different mobile phase ratio environments in positive ion mode to characterize patterns in the matrix effect. As the organic phase ratio increased from low to high, the log IE of most compounds changed by 1.0 log units. The addition of formic acid enhanced the signal but also promoted the matrix effect, which often occurred in compounds with strong ionization capacity. It was speculated that the matrix effect arose mainly from competition for charge and for gasification sites on charged droplets during MS detection. We then present a log IE prediction method built using the COSMO-RS software and an artificial neural network (ANN) algorithm to address this difficulty and to overcome a shortcoming of previous models, which ignore the matrix effect. The model was developed following the principles of QSAR modeling recommended by the Organisation for Economic Co-operation and Development (OECD). Furthermore, we validated this approach by predicting the log IE of 70 compounds, including those not involved in the log IE model development. The results demonstrate that the proposed method predicts log IE with excellent accuracy (R²pred = 0.880), which means that it has the potential to predict the log IE of new compounds without authentic standards.
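
As an illustration of the two quantities quoted above, the following minimal sketch shows how a predictive R² on an external test set and the fold change implied by a shift in log IE might be computed. It is not the authors' code. It assumes that log IE is a base-10 logarithm and that R²pred follows a common QSAR convention (one minus the ratio of the test-set prediction error to the test-set variance around the training-set mean); the abstract states neither. The function names `r2_pred` and `fold_change` are hypothetical.

```python
from statistics import mean


def r2_pred(y_train, y_test, y_test_pred):
    """Predictive R^2 for an external test set.

    Assumed convention (not stated in the abstract):
    1 - sum((y_test - y_pred)^2) / sum((y_test - mean(y_train))^2)
    """
    y_bar_train = mean(y_train)
    # Sum of squared prediction errors on the held-out compounds
    press = sum((y - yh) ** 2 for y, yh in zip(y_test, y_test_pred))
    # Spread of the held-out values around the training-set mean
    ss_tot = sum((y - y_bar_train) ** 2 for y in y_test)
    return 1.0 - press / ss_tot


def fold_change(delta_log_ie):
    """Fold change in IE for a given shift in log IE (base 10 assumed)."""
    return 10.0 ** delta_log_ie
```

Under these assumptions, a shift of 1.0 log units corresponds to a tenfold change in ionization efficiency, and a model that only ever predicts the training-set mean scores an R²pred of zero.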