A threading-based method (FINDSITE) for ligand-binding site prediction and functional annotation

Proc Natl Acad Sci U S A. 2008 Jan 8;105(1):129-34. doi: 10.1073/pnas.0707684105. Epub 2007 Dec 28.

Abstract

The detection of ligand-binding sites is often the starting point for protein function identification and drug discovery. Because of inaccuracies in predicted protein structures, extant binding pocket-detection methods are limited to experimentally solved structures. Here, FINDSITE, a method for ligand-binding site prediction and functional annotation based on binding-site similarity across groups of weakly homologous template structures identified from threading, is described. For crystal structures, considering a cutoff distance of 4 Å as the hit criterion, the success rate is 70.9% for identifying the best of the top five predicted ligand-binding sites, with a ranking accuracy of 76.0%. Both high prediction accuracy and the ability to correctly rank identified binding sites are sustained when approximate protein models (<35% sequence identity to the closest template structure) are used, showing a 67.3% success rate with 75.5% ranking accuracy. In practice, FINDSITE tolerates structural inaccuracies in protein models up to an rmsd from the crystal structure of 8-10 Å. This is because analysis of weakly homologous protein models reveals that about half have an rmsd from the native binding site of <2 Å. Furthermore, the chemical properties of template-bound ligands can be used to select ligand templates associated with the binding site. In most cases, FINDSITE can accurately assign a molecular function to the protein model.
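
The hit criterion and success rate quoted above can be illustrated with a minimal sketch. It is not the FINDSITE implementation: the abstract does not say which two points the 4 Å distance is measured between, so the code assumes the distance between a predicted site center and the center of the bound ligand, and assumes the cutoff is inclusive. The names `is_hit` and `success_rate` and the input layout are hypothetical.

```python
import math

CUTOFF_ANGSTROM = 4.0  # hit cutoff quoted in the abstract
TOP_N = 5              # number of ranked predictions considered in the abstract


def is_hit(ranked_sites, ligand_center, cutoff=CUTOFF_ANGSTROM, top_n=TOP_N):
    """Return True if any of the first top_n predicted site centers lies
    within cutoff of the ligand center.

    Assumption: both are (x, y, z) points in angstroms, and the cutoff is
    inclusive. The abstract does not specify either detail.
    """
    return any(
        math.dist(site, ligand_center) <= cutoff
        for site in ranked_sites[:top_n]
    )


def success_rate(targets):
    """Fraction of targets with at least one hit among the top predictions.

    targets is a list of (ranked_sites, ligand_center) pairs, one per protein.
    """
    if not targets:
        raise ValueError("at least one target is required")
    hits = sum(is_hit(sites, ligand) for sites, ligand in targets)
    return hits / len(targets)
```

For example, a benchmark of two proteins in which only one has a predicted center within 4 Å of its ligand gives `success_rate(...) == 0.5`. A prediction ranked sixth or lower is ignored even if it is close to the ligand, which is what "best of the top five" implies under these assumptions.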

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Binding Sites
  • Biophysics / methods*
  • Computational Biology / methods*
  • Crystallography, X-Ray / methods
  • Ligands
  • Models, Molecular
  • Models, Statistical
  • Molecular Conformation
  • Protein Binding
  • Protein Conformation
  • Protein Interaction Mapping
  • Proteins / chemistry
  • Reproducibility of Results
  • Software*

Substances

  • Ligands
  • Proteins