Virtual screening of Abl inhibitors from large compound libraries by support vector machines

J Chem Inf Model. 2009 Sep;49(9):2101-10. doi: 10.1021/ci900135u.


Abl promotes cancers by regulating cell morphogenesis, motility, growth, and survival. The success of several marketed and clinical-trial Abl inhibitors against leukemia and other cancers, together with the emergence of reduced efficacy and drug resistance, has led to significant interest in developing new Abl inhibitors. In silico pharmacophore, fragment-based, and molecular docking methods have been used in some of these efforts. It is desirable to explore other in silico methods capable of searching large compound libraries at high yields and reduced false-hit rates. We evaluated support vector machines (SVM) as a virtual screening (VS) tool for searching large compound libraries for Abl inhibitors. SVM trained and tested on 708 inhibitors and 65,494 putative noninhibitors correctly identified 84.4 to 92.3% of inhibitors and 99.96 to 99.99% of noninhibitors in 5-fold cross-validation studies. SVM trained on 708 pre-2008 inhibitors and 65,494 putative noninhibitors correctly identified 50.5% of the 91 inhibitors reported since 2008 and predicted as inhibitors 29,072 (0.21%) of 13.56M PubChem compounds, 659 (0.39%) of 168K MDDR compounds, and 330 (5.0%) of 6638 MDDR compounds similar to the known inhibitors. SVM showed comparable yields and substantially reduced false-hit rates compared with two similarity-based VS methods and one other machine learning VS method, all based on the same training and testing data sets and molecular descriptors. These results suggest that SVM is capable of searching large compound libraries for Abl inhibitors at low false-hit rates.
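The percentages quoted in the abstract follow from simple count ratios. The short Python sketch below recomputes them as an illustration only. The function names are hypothetical, the PubChem and MDDR library sizes are the rounded figures from the abstract, and the count of 46 correctly identified post-2008 inhibitors is inferred from "50.5% of the 91 inhibitors" rather than stated in the source.

```python
def rate_percent(count: int, total: int) -> float:
    """Return count / total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Percentage of known inhibitors that are predicted as inhibitors."""
    return rate_percent(true_pos, true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Percentage of noninhibitors that are predicted as noninhibitors."""
    return rate_percent(true_neg, true_neg + false_pos)


# (predicted hits, library size) as reported in the abstract.
# Library sizes for PubChem and MDDR are rounded values (13.56M, 168K).
screens = {
    "PubChem": (29_072, 13_560_000),
    "MDDR": (659, 168_000),
    "MDDR similar to known inhibitors": (330, 6_638),
}

# Fraction of each library flagged as a virtual hit, in percent.
hit_rates = {
    name: round(rate_percent(hits, size), 2)
    for name, (hits, size) in screens.items()
}

# Assumption: 50.5% of 91 post-2008 inhibitors corresponds to 46 compounds,
# the only whole number of compounds that rounds to that percentage.
inferred_found = round(0.505 * 91)
independent_sensitivity = round(sensitivity(inferred_found, 91 - inferred_found), 1)

if __name__ == "__main__":
    for name, pct in hit_rates.items():
        print(f"{name}: {pct}% predicted as inhibitors")
    print(f"Post-2008 inhibitors recovered: {inferred_found} of 91 "
          f"({independent_sensitivity}%)")
```

Under these assumptions the recomputed hit rates agree with the reported values after rounding; the last one comes out as 4.97%, which the abstract reports as 5.0%.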

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Databases, Factual
  • Drug Evaluation, Preclinical / methods*
  • Protein Kinase Inhibitors / analysis*
  • Protein Kinase Inhibitors / chemistry
  • Protein Kinase Inhibitors / pharmacology*
  • Proto-Oncogene Proteins c-abl / antagonists & inhibitors*
  • Reproducibility of Results
  • User-Computer Interface*


Substances

  • Protein Kinase Inhibitors
  • Proto-Oncogene Proteins c-abl