The Virtual Screening of the Drug Protein with a Few Crystal Structures Based on the Adaboost-SVM

Comput Math Methods Med. 2016:2016:4809831. doi: 10.1155/2016/4809831. Epub 2016 Apr 3.

Abstract

Using machine learning to assist virtual screening (VS) has proven to be an effective strategy. However, the quality of the training set can be degraded when incorrect docking poses are mixed in, which reduces screening efficiency. To address this problem, we present a method that uses ensemble learning to improve the support vector machine (SVM) that processes the generated protein-ligand interaction fingerprints (IFPs). By combining multiple classifiers, ensemble learning avoids the performance limitations of a single classifier and achieves better generalization. In virtual screening experiments with SRC and Cathepsin K as targets, the ensemble learning method effectively reduced the errors caused by low sample quality and improved the performance of the overall virtual screening process.
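The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: discrete AdaBoost over weighted SVM base classifiers, trained on binary fingerprint vectors whose labels are partly wrong. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy data generator `make_toy_ifps`, the linear kernel, the subgradient training loop, and all hyperparameters.

```python
import math
import random


def make_toy_ifps(n=60, n_bits=8, noise=0.1, seed=0):
    """Hypothetical stand-in for docking-derived IFPs: bit 0 marks a key
    contact made by true actives; a fraction of labels is flipped to mimic
    wrong docking poses contaminating the training set."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        active = i % 2 == 0
        bits = [1 if active else 0] + [rng.randint(0, 1) for _ in range(n_bits - 1)]
        label = 1 if active else -1
        if rng.random() < noise:
            label = -label  # simulated bad pose / mislabeled sample
        X.append(bits)
        y.append(label)
    return X, y


def decision(model, x):
    """Signed distance-like score w.x + b of a linear model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, sample_w, epochs=30, lam=0.01, lr=0.1):
    """Weighted linear SVM fitted by subgradient descent on the hinge loss."""
    n = len(X)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi, si in zip(X, y, sample_w):
            violated = yi * decision((w, b), xi) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if violated:
                step = lr * si * n * yi  # n * si equals 1 for uniform weights
                w = [wj + step * xj for wj, xj in zip(w, xi)]
                b += step
    return w, b


def predict_svm(model, x):
    return 1 if decision(model, x) > 0 else -1


def adaboost_svm(X, y, n_rounds=10):
    """Discrete AdaBoost with the weighted linear SVM as base learner."""
    n = len(X)
    sample_w = [1.0 / n] * n
    ensemble = []  # list of (alpha, model) pairs
    for _ in range(n_rounds):
        model = train_linear_svm(X, y, sample_w)
        preds = [predict_svm(model, xi) for xi in X]
        err = sum(s for s, p, yi in zip(sample_w, preds, y) if p != yi)
        if err >= 0.5:  # no better than chance: stop boosting
            break
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, model))
        if err == 0:  # perfect base learner: nothing left to reweight
            break
        # Up-weight misclassified samples, down-weight correct ones.
        sample_w = [s * math.exp(-alpha * yi * p)
                    for s, p, yi in zip(sample_w, preds, y)]
        total = sum(sample_w)
        sample_w = [s / total for s in sample_w]
    return ensemble


def predict_ensemble(ensemble, x):
    """Weighted vote of the base classifiers; ties count as inactive (-1)."""
    score = sum(alpha * predict_svm(model, x) for alpha, model in ensemble)
    return 1 if score > 0 else -1


X_train, y_train = make_toy_ifps(seed=0)
ensemble = adaboost_svm(X_train, y_train)
predictions = [predict_ensemble(ensemble, x) for x in X_train]
print(len(ensemble), "base classifiers;", predictions[:6])
```

In this sketch each boosting round retrains the SVM with more weight on the samples the previous round misclassified, and the final label is a weighted vote. A real pipeline would replace `make_toy_ifps` with fingerprints computed from docked poses and would likely use a kernel SVM from an established library.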

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Area Under Curve
  • Binding Sites
  • Cathepsin K / chemistry
  • Chemistry, Pharmaceutical / methods
  • Combinatorial Chemistry Techniques
  • Computational Biology / methods
  • Crystallization
  • Databases, Protein
  • Drug Design*
  • Humans
  • Hydrogen / chemistry
  • Imaging, Three-Dimensional
  • Ligands
  • Models, Statistical
  • Protein Binding
  • Protein Conformation
  • Proteins / chemistry*
  • ROC Curve
  • Reproducibility of Results
  • Software
  • Support Vector Machine*

Substances

  • Ligands
  • Proteins
  • Hydrogen
  • Cathepsin K