Improving chemical similarity ensemble approach in target prediction

J Cheminform. 2016 Apr 23;8:20. doi: 10.1186/s13321-016-0130-x. eCollection 2016.

Abstract

Background: In silico target prediction of compounds plays an important role in drug discovery. The chemical similarity ensemble approach (SEA) is a promising method that has been successfully applied in many drug-related studies. Because the approach can be built on different types of molecular fingerprints, various SEA-like models are available. To investigate the influence of training data selection and the complementarity of different models, we constructed and tested several SEA models.
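To illustrate how a fingerprint-based set comparison of this kind can work, the following is a minimal sketch, not the authors' implementation. It assumes fingerprints are represented as sets of "on" bit indices, uses the Tanimoto coefficient as the similarity measure, and sums pairwise similarities above a cutoff to obtain a raw set-versus-set score. The function names and the cutoff of 0.5 are hypothetical, and the statistical step that turns a raw score into a significance value is omitted.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    # Two empty fingerprints share nothing; treat their similarity as 0.
    return len(a & b) / union if union else 0.0


def raw_set_score(query_fps, target_ligand_fps, threshold=0.5):
    """Sum pairwise Tanimoto similarities that reach the cutoff.

    `threshold` is a hypothetical value chosen for illustration only.
    """
    total = 0.0
    for query_fp in query_fps:
        for ligand_fp in target_ligand_fps:
            similarity = tanimoto(query_fp, ligand_fp)
            if similarity >= threshold:
                total += similarity
    return total
```

Swapping the fingerprint type (for example atom pair instead of MACCS) changes only the bit sets passed in, which is why several analogous models can be built on the same scoring scheme.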

Results: On a test set of 37,138 positive and 42,928 negative ligand-target interactions, at a significance level of 0.05, the topological-based model yielded the best precision (83.7%) and [Formula: see text] (0.784) among the five molecular fingerprint methods tested, while the atom pair-based model yielded the best [Formula: see text] (0.694). Combining the five models through an election system gave a flexible prediction scheme, with precision ranging from 71 to 90.6%, [Formula: see text] from 0.663 to 0.684, and [Formula: see text] from 0.696 to 0.817.
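One way an election system over the five models could be realized is sketched below. This is an assumption for illustration, since the abstract does not specify the voting rule: each model casts a vote when its p-value for a ligand-target pair falls at or below the significance level, and the pair is predicted as an interaction when the number of votes reaches a chosen minimum. The names `FINGERPRINTS`, `count_votes`, `elect` and `min_votes` are hypothetical.

```python
# Hypothetical labels for the five fingerprint-based models named in the abstract.
FINGERPRINTS = ("atom_pair", "topological", "morgan", "maccs", "pharmacophore")


def count_votes(p_values, alpha=0.05):
    """Count the models whose p-value is significant at level alpha."""
    return sum(1 for name in FINGERPRINTS if p_values[name] <= alpha)


def elect(p_values, min_votes, alpha=0.05):
    """Predict an interaction when at least `min_votes` models agree."""
    return count_votes(p_values, alpha) >= min_votes


# Example with made-up p-values: three of the five models are significant.
example = {
    "atom_pair": 0.01,
    "topological": 0.03,
    "morgan": 0.04,
    "maccs": 0.20,
    "pharmacophore": 0.60,
}
lenient = elect(example, min_votes=1)  # any single model suffices
strict = elect(example, min_votes=4)   # requires near-unanimous agreement
```

Under this assumed rule, raising `min_votes` makes the scheme more conservative, which would be expected to trade coverage for precision; that is one plausible reading of the "flexible" range of results reported above.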

Conclusions: The overall effectiveness of the five models can be ranked in decreasing order as follows: Atom pair [Formula: see text] Topological > Morgan > MACCS > Pharmacophore. Combining multiple SEA models, which takes advantage of the strengths of each, could improve prediction success rates. The models might be further improved by using target-specific classes or more active compounds.

Keywords: Fingerprint; Off-target effect; Similarity; Target identification.