Benchmarking DFT and Supervised Machine Learning: An Organic Semiconducting Polymer Investigation

J Phys Chem A. 2024 Feb 1;128(4):709-715. doi: 10.1021/acs.jpca.3c04905. Epub 2024 Jan 23.

Abstract

Using a training set of twenty-two well-known semiconducting organic polymers, we studied how accurately a simple supervised machine learning algorithm, linear regression, can predict the bandgap (BG) and ionization potential (IP) of new polymers. We show that, by combining the PBE or PW91 exchange-correlation functional with this simple linear regression, BGs and IPs can be calculated with average percent errors of less than 3% and 4%, respectively. We then apply this method to predict the BG and IP of a group of new polymers composed of monomers used in the training set, and their derivatives, in AABB and ABAB orientations.
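As a rough illustration of the kind of workflow the abstract describes, the sketch below fits a one-variable linear regression and reports the average percent error of the fitted values. It assumes the regression maps a DFT-calculated property onto its experimental counterpart, which the abstract does not spell out. The function names and the numbers are hypothetical placeholders, not data or code from the paper.

```python
import numpy as np


def fit_linear_correction(calculated, experimental):
    """Least-squares fit of experimental ~ slope * calculated + intercept."""
    slope, intercept = np.polyfit(calculated, experimental, deg=1)
    return float(slope), float(intercept)


def average_percent_error(predicted, reference):
    """Mean of |predicted - reference| / |reference|, expressed in percent."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference) / np.abs(reference)) * 100.0)


# Placeholder bandgaps in eV. These are invented for illustration only
# and are NOT values from the paper's training set.
dft_bg = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
exp_bg = np.array([1.9, 2.3, 2.9, 3.3, 3.9])

slope, intercept = fit_linear_correction(dft_bg, exp_bg)
corrected_bg = slope * dft_bg + intercept
error_pct = average_percent_error(corrected_bg, exp_bg)

print(f"slope={slope:.3f}, intercept={intercept:.3f} eV, "
      f"average percent error={error_pct:.2f}%")
```

The same two functions could be reused for the ionization potential by swapping in the corresponding calculated and reference arrays. A new polymer would then be predicted by applying the fitted slope and intercept to its DFT value.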