FINDSITE(comb): a threading/structure-based, proteomic-scale virtual ligand screening approach

J Chem Inf Model. 2013 Jan 28;53(1):230-40. doi: 10.1021/ci300510n. Epub 2012 Dec 28.


Virtual ligand screening is an integral part of the modern drug discovery process. Traditional ligand-based virtual screening approaches are fast but require a set of structurally diverse ligands known to bind to the target. Traditional structure-based approaches require high-resolution target protein structures and are computationally demanding. In contrast, the recently developed threading/structure-based FINDSITE approaches are as fast as traditional ligand-based approaches yet overcome the limitations of both traditional ligand-based and structure-based approaches. These new methods can use predicted low-resolution structures and infer the likelihood of a ligand binding to a target by utilizing ligand information excised from the target's remote or close homologous proteins and/or from ligand binding databases. Here, we develop an improved version of FINDSITE, FINDSITE(filt), that filters out false positive ligands in threading-identified templates by a better binding site detection procedure that includes information about the binding site amino acid similarity. We then combine FINDSITE(filt) with FINDSITE(X), which uses the publicly available binding databases ChEMBL and DrugBank for virtual ligand screening. The combined approach, FINDSITE(comb), is compared to two traditional docking methods, AUTODOCK Vina and DOCK 6, on the DUD benchmark set. FINDSITE(comb) is shown to be significantly better in terms of enrichment factor, dependence on target structure quality, and speed. FINDSITE(comb) is then tested for virtual ligand screening on a large set of 3576 generic targets from the DrugBank database as well as a set of 168 human GPCRs. Excluding close homologues, FINDSITE(comb) gives an average enrichment factor of 52.1 for generic targets and 22.3 for GPCRs within the top 1% of the screened compound library. Around 65% of the targets have better-than-random enrichment factors. The performance is insensitive to target structure quality, as long as the target structure has a TM-score ≥ 0.4 to the native structure. Thus, FINDSITE(comb) makes the screening of millions of compounds across entire proteomes feasible. The FINDSITE(comb) web service is freely available for academic users at
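The enrichment factors quoted above measure how strongly a ranked screening list concentrates true binders near the top. As a minimal sketch, assuming the commonly used definition (fraction of actives in the top-ranked subset divided by the fraction of actives in the whole library), the calculation might look like the following. The function name `enrichment_factor`, its parameters, and the toy data are illustrative assumptions and are not taken from the FINDSITE(comb) software; the paper's exact formula may differ in detail.

```python
def enrichment_factor(scores, is_active, top_fraction=0.01):
    """Enrichment factor of a ranked list at a given top fraction.

    Assumed definition (not taken from the paper's code):
        EF = (actives in top subset / size of top subset)
             / (actives in library / size of library)
    Higher scores are assumed to mean "more likely to bind".
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Size of the top-ranked subset (at least one compound).
    n_top = max(1, int(round(n_total * top_fraction)))

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(1 for _, flag in ranked[:n_top] if flag)

    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy library: 1000 compounds, 10 actives, 5 of them ranked in the top 10.
toy_scores = [1000 - i for i in range(1000)]
active_ranks = {0, 1, 2, 3, 4, 500, 600, 700, 800, 900}
toy_labels = [i in active_ranks for i in range(1000)]
toy_ef = enrichment_factor(toy_scores, toy_labels, top_fraction=0.01)
```

Under this definition a random ranking gives an enrichment factor of about 1, and the value at a 1% cutoff can be at most 100, which gives a scale for reading the averages of 52.1 and 22.3 reported above.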

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Binding Sites
  • Databases, Protein
  • Drug Evaluation, Preclinical / methods*
  • Humans
  • Ligands
  • Models, Molecular
  • Molecular Docking Simulation
  • Protein Conformation
  • Proteomics / methods*
  • Receptors, G-Protein-Coupled / chemistry
  • Receptors, G-Protein-Coupled / metabolism
  • User-Computer Interface*


Substances

  • Ligands
  • Receptors, G-Protein-Coupled