New Approach Combining Molecular Fingerprints and Machine Learning to Estimate Relative Ionization Efficiency in Electrospray Ionization

ACS Omega. 2020 Apr 14;5(16):9510-9516. doi: 10.1021/acsomega.0c00732. eCollection 2020 Apr 28.

Abstract

Electrospray ionization (ESI) is widely used as an ionization source for the analysis of complex mixtures by mass spectrometry. However, different compounds ionize more or less effectively in the ESI source, meaning instrument responses can vary by orders of magnitude, often in hard-to-predict ways. This precludes the use of ESI for quantitative analysis where authentic standards are not available. Relative ionization efficiency (RIE) scales have been proposed as a route to predict the response of compounds in ESI. In this work, a scale of RIEs was constructed for 51 carboxylic acids, spanning a wide range of additional functionalities, to produce a model for predicting the RIE of unknown compounds. While using a limited number of compounds, we explore the usefulness of building a predictor using popular supervised regression techniques, encoding the compounds as combinations of different structural features using a range of common "fingerprints". It was found that Bayesian ridge regression gives the best predictive model, encoding compounds using features designed for activity coefficient models. This produced a predictive model with an R² score of 0.62 and a root-mean-square error (RMSE) of 0.362. Such scores are comparable to those obtained in previous studies but without the requirement to first measure or predict the physical properties of the compounds, potentially reducing the time required to make predictions.
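
The workflow summarized above (encode each compound as a vector of structural-feature counts, fit a Bayesian ridge regressor, and report R² and RMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: the feature matrix and RIE values are synthetic placeholders, the number of features (12) is arbitrary, and the use of 5-fold cross-validation is an assumption, since the abstract does not state the evaluation protocol. The helper name `evaluate_rie_model` is hypothetical. Only the scikit-learn estimator `BayesianRidge` corresponds to the technique named in the abstract.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict


def evaluate_rie_model(X, y, n_splits=5, seed=0):
    """Cross-validated R^2 and RMSE for a Bayesian ridge regressor.

    X: (n_compounds, n_features) array of structural-feature counts.
    y: (n_compounds,) array of RIE values (assumed to be on a log scale).
    The k-fold protocol is an assumption, not taken from the paper.
    """
    model = BayesianRidge()
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Each compound is predicted by a model that never saw it in training.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    r2 = r2_score(y, y_pred)
    rmse = float(np.sqrt(mean_squared_error(y, y_pred)))
    return r2, rmse


# Synthetic stand-in data: 51 "compounds" described by 12 group counts.
# These numbers are placeholders, not measurements from the study.
rng = np.random.default_rng(0)
n_compounds, n_groups = 51, 12
X = rng.integers(0, 4, size=(n_compounds, n_groups)).astype(float)
true_weights = rng.normal(0.0, 0.3, size=n_groups)
y = X @ true_weights + rng.normal(0.0, 0.2, size=n_compounds)

r2, rmse = evaluate_rie_model(X, y)
print(f"R^2 = {r2:.2f}, RMSE = {rmse:.3f}")
```

With real data, `X` would be replaced by fingerprint vectors computed from the compound structures and `y` by the measured RIE scale. Because the synthetic data here are generated from a linear model, the printed scores say nothing about the scores reported in the paper.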