Support vector machines for predictive modeling in heterogeneous catalysis: a comprehensive introduction and overfitting investigation based on two real applications

J Comb Chem. 2006 Jul-Aug;8(4):583-96. doi: 10.1021/cc050093m.

Abstract

This work provides an introduction to support vector machines (SVMs) for predictive modeling in heterogeneous catalysis, describing the methodology step by step and highlighting the points that make this technique an attractive approach. We first investigate linear SVMs, working in detail through a simple example based on experimental data from a high-throughput experimentation study aimed at optimizing olefin epoxidation catalysts. This case study was chosen because the small number of catalytic variables investigated allows SVM features to be illustrated visually. It is shown how SVMs transform the original data into another representation space of higher dimensionality. The concepts of Vapnik-Chervonenkis dimension and structural risk minimization are introduced. The SVM methodology is then evaluated with a second catalytic application, light paraffin isomerization. Finally, we discuss why SVMs are a strategic method compared to other machine learning techniques, such as neural networks or induction trees, and why emphasis is put on the problem of overfitting.
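
The idea of mapping data into a higher-dimensional representation space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it comes from the paper: the four XOR-style points, the feature map `phi`, the helper names, and the hand-picked hyperplane. No SVM solver is run here. The sketch only checks that no separating line is found on a coarse grid in the original two variables, while a simple plane separates the same points once the product term is added as a third coordinate.

```python
from itertools import product

# Hypothetical toy data (not from the paper): label is the sign of x1 * x2.
X = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
y = [1, 1, -1, -1]


def phi(x):
    """Assumed explicit feature map: append the product x1 * x2."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def margins(points, labels, w, b):
    """Functional margins y_i * (w . x_i + b) for every point."""
    return [
        lab * (sum(wi * pi for wi, pi in zip(w, p)) + b)
        for p, lab in zip(points, labels)
    ]


def separates(points, labels, w, b):
    """True if the hyperplane (w, b) puts every point on its correct side."""
    return all(m > 0 for m in margins(points, labels, w, b))


# Coarse grid search in the original 2-D space: w1, w2, b in [-5, 5], step 0.5.
grid = [i / 2 for i in range(-10, 11)]
found_2d = any(
    separates(X, y, (w1, w2), b) for w1, w2, b in product(grid, repeat=3)
)

# In the mapped 3-D space, the hand-picked plane w = (0, 0, 1), b = 0 works.
Z = [phi(x) for x in X]
w3d, b3d = (0.0, 0.0, 1.0), 0.0
m3d = margins(Z, y, w3d, b3d)

print("separator found in 2-D grid:", found_2d)
print("margins in 3-D:", m3d)
```

In practice an SVM avoids building the mapped coordinates explicitly and works through a kernel function instead; the explicit map is used here only because it makes the change of representation space easy to inspect.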

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alkenes / chemistry*
  • Catalysis
  • Database Management Systems*
  • Databases, Factual
  • Forecasting*
  • Isomerism
  • Models, Chemical
  • Neural Networks, Computer
  • Oxidation-Reduction
  • Pattern Recognition, Automated / methods*

Substances

  • Alkenes