Prediction of Compound Profiling Matrices Using Machine Learning

Raquel Rodríguez-Pérez; Tomoyuki Miyao; Swarit Jasial; Martin Vogt; Jürgen Bajorath

doi:10.1021/acsomega.8b00462

ACS Omega. 2018 Apr 30;3(4):4713-4723. doi: 10.1021/acsomega.8b00462.

Authors

Raquel Rodríguez-Pérez¹, Tomoyuki Miyao¹, Swarit Jasial¹, Martin Vogt¹, Jürgen Bajorath¹

Affiliation

¹ Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany.

Abstract

Screening of compound libraries against panels of targets yields profiling matrices. Such matrices typically contain structurally diverse screening compounds, large numbers of inactives, and small numbers of hits per assay. As such, they represent interesting and challenging test cases for computational screening and activity predictions. In this work, modeling of large compound profiling matrices extracted from publicly available screening data was attempted. Different machine learning methods, including deep learning, were compared, and different prediction strategies were explored. Prediction accuracy varied for assays with different numbers of active compounds, and alternative machine learning approaches often produced comparable results. Deep learning did not further increase the prediction accuracy of standard methods such as random forests or support vector machines. Target-based random forest models were prioritized and yielded successful predictions of active compounds for many assays.
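
To make the notion of a profiling matrix concrete, the following is a minimal sketch of how such a matrix might be represented and summarized per assay. It is not the authors' code or data: the compound and assay identifiers, the activity values, and the helper names `assay_summary` and `matrix_density` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy profiling matrix (all values invented for illustration).
# Rows are compounds, columns are assays.
# 1 = active (hit), 0 = inactive, None = compound not tested in that assay.
ASSAYS: List[str] = ["assay_A", "assay_B", "assay_C"]
MATRIX: Dict[str, List[Optional[int]]] = {
    "cpd_1": [0, 0, 1],
    "cpd_2": [0, None, 0],
    "cpd_3": [1, 0, 0],
    "cpd_4": [0, 0, 0],
    "cpd_5": [0, 0, None],
}


def assay_summary(
    matrix: Dict[str, List[Optional[int]]], assays: List[str]
) -> Dict[str, Dict[str, float]]:
    """Return, for each assay, the number of tested compounds, hits, and the hit rate."""
    summary: Dict[str, Dict[str, float]] = {}
    for col, assay in enumerate(assays):
        # Keep only the cells where the compound was actually tested.
        tested = [row[col] for row in matrix.values() if row[col] is not None]
        hits = sum(tested)
        rate = hits / len(tested) if tested else 0.0
        summary[assay] = {"tested": len(tested), "hits": hits, "hit_rate": rate}
    return summary


def matrix_density(matrix: Dict[str, List[Optional[int]]]) -> float:
    """Return the fraction of matrix cells that contain a measurement."""
    cells = [value for row in matrix.values() for value in row]
    return sum(value is not None for value in cells) / len(cells)


if __name__ == "__main__":
    for assay, stats in assay_summary(MATRIX, ASSAYS).items():
        print(assay, stats)
    print("density:", matrix_density(MATRIX))
```

Even in this toy case, inactives far outnumber hits in every column, which is the class imbalance the abstract describes. A per-assay summary of this kind is one plausible first step before training a separate model for each target.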