Improved Prediction of Aqueous Solubility of Novel Compounds by Going Deeper With Deep Learning

Front Oncol. 2020 Feb 11:10:121. doi: 10.3389/fonc.2020.00121. eCollection 2020.

Abstract

Aqueous solubility is an important physicochemical property of compounds in anti-cancer drug discovery. Artificial intelligence solubility prediction tools have scored impressive performances by employing regression, machine learning, and deep learning methods. The reported performances vary significantly partly because of the different datasets used. Solubility prediction on novel compounds needs to be improved, which may be achieved by going deeper with deep learning. We constructed deeper-net models of ~20-layer modified ResNet convolutional neural network architecture, which were trained and tested with 9,943 compounds encoded by molecular fingerprints. Retrospectively tested by 62 recently-published novel compounds, one deeper-net model outperformed four established tools, shallow-net models, and four human experts. Deeper-net models also outperformed others in predicting the solubility values of a series of novel compounds newly-synthesized for anti-cancer drug discovery. Solubility prediction may be improved by going deeper with deep learning. Our deeper-net models are accessible at http://www.npbdb.net/solubility/index.jsp.

Keywords: anti-cancer drug discovery; aqueous solubility; artificial intelligence; chemical; compounds; deep learning.