Docking molecules by families to increase the diversity of hits in database screens: computational strategy and experimental evaluation

Proteins. 2001 Feb 1;42(2):279-93. doi: 10.1002/1097-0134(20010201)42:2<279::aid-prot150>;2-u.


Molecular docking programs screen chemical databases for novel ligands that fit protein binding sites. When one compound fits the site well, close analogs typically do the same. Therefore, many of the compounds that are found in such screens resemble one another. This reduces the variety and novelty of the compounds suggested. In an attempt to increase the diversity of docking hit lists, the Available Chemicals Directory was grouped into families of related structures. All members of every family were docked and scored, but only the best scoring molecule of a high-ranking family was allowed in the hit list. The identity and scores of the other members of these families were recorded as annotations to the best family member, but they were not independently ranked. This family-based docking method was compared with molecule-by-molecule docking in screens against the structures of thymidylate synthase, dihydrofolate reductase (DHFR), and the cavity site of the mutant T4 lysozyme Leu99→Ala (L99A). In each case, the diversity of the hit list increased, and more families of known ligands were found. To investigate whether the newly identified hits were sensible, we tested representative examples experimentally for binding to L99A and DHFR. Of the six compounds tested against L99A, five bound to the internal cavity. Of the seven compounds tested against DHFR, six inhibited the enzyme with apparent Ki values between 0.26 and 100 µM. The segregation of potential ligands into families of related molecules is a simple technique to increase the diversity of candidates suggested by database screens. The general approach should be applicable to most docking methods. Proteins 2001;42:279-293.
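The family-based ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name family_hit_list, the FamilyHit record, and the family_of mapping are hypothetical, and the sketch assumes that family labels are already assigned and that a lower docking score is better. The abstract does not state how families were built or which scoring convention was used.

```python
from collections import defaultdict
from typing import NamedTuple


class FamilyHit(NamedTuple):
    """One hit-list entry: the best-scoring member of a family (hypothetical record)."""
    family: str
    best_molecule: str
    best_score: float
    annotations: list  # (molecule, score) pairs for the other family members


def family_hit_list(scores, family_of, top_n=None):
    """Collapse per-molecule docking scores into one ranked entry per family.

    scores:    dict mapping molecule id -> docking score
               (assumption: lower score means a better fit)
    family_of: dict mapping molecule id -> family label
               (assumption: the grouping was computed beforehand)
    top_n:     optional cap on the number of families returned
    """
    # Every molecule is docked and scored; here we only group the results.
    members = defaultdict(list)
    for molecule, score in scores.items():
        members[family_of[molecule]].append((molecule, score))

    hits = []
    for family, molecules in members.items():
        # Sort by score, then by id so that ties are broken deterministically.
        ranked = sorted(molecules, key=lambda pair: (pair[1], pair[0]))
        best_molecule, best_score = ranked[0]
        # The remaining members are kept as annotations, not ranked on their own.
        hits.append(FamilyHit(family, best_molecule, best_score, ranked[1:]))

    # Families are ranked by the score of their best member.
    hits.sort(key=lambda hit: (hit.best_score, hit.best_molecule))
    return hits if top_n is None else hits[:top_n]


if __name__ == "__main__":
    # Made-up scores and family labels, for illustration only.
    toy_scores = {"a1": -30.0, "a2": -28.0, "b1": -25.0, "c1": -29.0}
    toy_families = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
    for hit in family_hit_list(toy_scores, toy_families):
        print(hit.family, hit.best_molecule, hit.best_score, hit.annotations)
```

In a molecule-by-molecule ranking of the toy data, a1 and a2 would occupy the top two places. In the family-based list, a2 appears only as an annotation on a1, which leaves room for a member of a different family.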

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Binding Sites
  • Computational Biology / methods
  • Databases, Factual*
  • Enzyme Inhibitors / chemistry
  • Information Storage and Retrieval*
  • Ligands
  • Tetrahydrofolate Dehydrogenase / chemistry*
  • Thymidylate Synthase / antagonists & inhibitors
  • Thymidylate Synthase / chemistry*

Substances

  • Enzyme Inhibitors
  • Ligands
  • Tetrahydrofolate Dehydrogenase
  • Thymidylate Synthase