Encoding protein-ligand interaction patterns in fingerprints and graphs

J Chem Inf Model. 2013 Mar 25;53(3):623-37. doi: 10.1021/ci300566n. Epub 2013 Mar 6.


We present a novel and universal method for converting protein-ligand coordinates into a simple fingerprint of 210 integers that registers the corresponding molecular interaction pattern. Each interaction (hydrophobic, aromatic, hydrogen bond, ionic bond, metal complexation) is detected on the fly and physically described by a pseudoatom centered on the interacting ligand atom, the interacting protein atom, or the geometric center of both interacting atoms. Counting all possible triplets of interaction pseudoatoms within six distance ranges, and pruning the full integer vector to keep the most frequent triplets, yields a simple (210 integers) and coordinate frame-invariant interaction pattern descriptor (TIFP) that can be applied to compare any pair of protein-ligand complexes. TIFP fingerprints have been calculated for ca. 10,000 druggable protein-ligand complexes, thereby enabling a wide comparison of relationships between interaction pattern similarity and ligand or binding site pairwise similarity. We notably show that interaction pattern similarity strongly depends on binding site similarity. In addition to the TIFP fingerprint, which registers intermolecular interactions between a ligand and its target protein, we developed two tools (Ishape, Grim) to align protein-ligand complexes from their interaction patterns. Ishape is based on the overlap of interaction pseudoatoms using a smooth Gaussian function, whereas Grim uses a standard clique detection algorithm to match interaction pattern graphs. The two tools are complementary and enable protein-ligand complex alignments that capitalize on both global and local pattern similarities.
The new fingerprint and companion alignment tools have been successfully used in three scenarios: (i) interaction-biased alignment of protein-ligand complexes, (ii) postprocessing docking poses according to known interaction patterns for a particular target, and (iii) virtual screening for bioisosteric scaffolds sharing similar interaction patterns.
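
The triplet-counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The distance bin edges, the pseudoatom type labels, and the function names (distance_bin, triplet_key, count_triplets, to_vector) are all assumptions made for illustration, since the abstract states only that six distance ranges are used and that the full vector is pruned to the most frequent triplets. The sketch shows why such a descriptor is coordinate frame-invariant: it depends only on pseudoatom types and binned pairwise distances.

```python
from collections import Counter
from itertools import combinations, permutations
from math import dist

# Hypothetical bin edges in angstroms (assumption): five edges give six ranges,
# the last one open-ended. The abstract does not state the actual ranges.
BIN_EDGES = (4.0, 6.0, 8.0, 10.0, 12.0)


def distance_bin(d):
    """Return the index (0..5) of the distance range that contains d."""
    for i, edge in enumerate(BIN_EDGES):
        if d < edge:
            return i
    return len(BIN_EDGES)


def triplet_key(triplet):
    """Canonical key for three (type, xyz) pseudoatoms.

    The key holds the three types and the three binned pairwise distances.
    Taking the minimum over all orderings makes it independent of the order
    in which the pseudoatoms are listed.
    """
    keys = []
    for (ta, pa), (tb, pb), (tc, pc) in permutations(triplet):
        keys.append((ta, tb, tc,
                     distance_bin(dist(pa, pb)),
                     distance_bin(dist(pb, pc)),
                     distance_bin(dist(pa, pc))))
    return min(keys)


def count_triplets(pseudoatoms):
    """Count every triplet of interaction pseudoatoms by its canonical key."""
    return Counter(triplet_key(t) for t in combinations(pseudoatoms, 3))


def to_vector(counts, kept_keys):
    """Project the counts onto a fixed list of retained triplet keys.

    kept_keys is hypothetical: it stands in for the pruned set of most
    frequent triplets, which would be chosen from a reference collection.
    """
    return [counts.get(k, 0) for k in kept_keys]


# Toy complex with four pseudoatoms (type labels are illustrative only).
complex_a = [
    ("HBD", (0.0, 0.0, 0.0)),
    ("ARO", (3.0, 0.0, 0.0)),
    ("HYD", (0.0, 5.0, 0.0)),
    ("ION", (0.0, 0.0, 9.0)),
]
counts_a = count_triplets(complex_a)
vector_a = to_vector(counts_a, sorted(counts_a))
```

Because only types and binned distances enter the key, rotating or translating the complex, or listing its pseudoatoms in a different order, leaves the counts unchanged. Two complexes could then be compared with any vector similarity measure applied to their fixed-length vectors.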

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites
  • Conserved Sequence
  • Crystallography, X-Ray
  • Hydrogen Bonding
  • Ligands
  • Models, Molecular
  • Normal Distribution
  • Peptide Fragments / chemistry
  • Peptide Mapping / methods*
  • Protein Binding
  • Protein Conformation
  • Proteins / chemistry*
  • ROC Curve


Substances

  • Ligands
  • Peptide Fragments
  • Proteins