Large-Scale Comparison of Alternative Similarity Search Strategies with Varying Chemical Information Contents

ACS Omega. 2019 Sep 5;4(12):15304-15311. doi: 10.1021/acsomega.9b02470. eCollection 2019 Sep 17.


Similarity searching (SS) is a core approach in computational compound screening and has a long tradition in pharmaceutical research. Over the years, different approaches have been introduced to increase the information content of search calculations and optimize the ability to detect compounds having similar activity. We present a large-scale comparison of distinct search strategies on more than 600 qualifying compound activity classes. Challenging test cases for SS were identified and used to evaluate different ways to further improve search performance, which provided a differentiated view of alternative search strategies and their relative performance. It was found that search results could not only be improved by increasing compound input information but also by focusing similarity calculations on database compounds. In the presence of multiple active reference compounds, asymmetric SS with high weights on chemical features of target compounds emerged as an overall preferred approach across many different activity classes. These findings have implications for practical virtual screening applications.