Kernel Methods for Predicting Yields of Chemical Reactions

J Chem Inf Model. 2022 May 9;62(9):2077-2092. doi: 10.1021/acs.jcim.1c00699. Epub 2021 Oct 26.

Abstract

The use of machine learning methods for the prediction of reaction yield is an emerging area. We demonstrate the applicability of support vector regression (SVR) for predicting reaction yields, using combinatorial data. Molecular descriptors used in regression tasks related to chemical reactivity have often been based on time-consuming, computationally demanding quantum chemical calculations, usually density functional theory. Structure-based descriptors (molecular fingerprints and molecular graphs) are quicker and easier to calculate and are applicable to any molecule. In this study, SVR models built on structure-based descriptors were compared to models built on quantum chemical descriptors. The models were evaluated along the dimension of each reaction component in a set of Buchwald-Hartwig amination reactions. The structure-based SVR models outperformed the quantum chemical SVR models along the dimension of every reaction component. The applicability of the models was assessed with respect to similarity to the training data. Prospective predictions of unseen Buchwald-Hartwig reactions are presented for synthetic assessment, to validate the generalizability of the models, with particular interest in the aryl halide dimension.
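
As a rough illustration of the two ideas in the abstract (a kernel over structure-based descriptors, and evaluation along the dimension of one reaction component), the following is a minimal sketch and not the authors' implementation. It assumes a Tanimoto kernel over fingerprint on-bits, which the abstract does not name, and it reads "along the dimension" as holding out every reaction that shares one value of a component. The fingerprints, component labels, and function names are all hypothetical.

```python
from itertools import product


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def gram_matrix(fps_x, fps_y):
    """Pairwise kernel matrix, rows indexed by fps_x and columns by fps_y."""
    return [[tanimoto(x, y) for y in fps_y] for x in fps_x]


def leave_one_component_out(reactions, component):
    """Yield (held_out_value, train_idx, test_idx) for each value of a component.

    Every reaction containing the held-out value goes to the test set, so the
    model is scored on a component value it never saw during training.
    """
    for value in sorted({r[component] for r in reactions}):
        test = [i for i, r in enumerate(reactions) if r[component] == value]
        train = [i for i, r in enumerate(reactions) if r[component] != value]
        yield value, train, test


# Hypothetical toy data: made-up on-bit sets, not real fingerprints.
toy_fps = {"ArCl": {1, 4, 7}, "ArBr": {1, 4, 9}, "ArI": {2, 4, 9}}
toy_reactions = [
    {"aryl_halide": halide, "ligand": ligand}
    for halide, ligand in product(toy_fps, ["L1", "L2"])
]

K = gram_matrix(list(toy_fps.values()), list(toy_fps.values()))
splits = list(leave_one_component_out(toy_reactions, "aryl_halide"))
```

A Gram matrix of this kind could be passed to any SVR implementation that accepts a precomputed kernel, for example scikit-learn's `SVR(kernel="precomputed")`, with one model fitted per train/test split.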

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Machine Learning*
  • Prospective Studies