Diversifying chemical libraries with generative topographic mapping

J Comput Aided Mol Des. 2020 Jul;34(7):805-815. doi: 10.1007/s10822-019-00215-x. Epub 2019 Aug 12.


Generative topographic mapping was used to investigate whether the in-house compound collection of Boehringer Ingelheim (BI) could be diversified. For this purpose, a 2D map covering the relevant chemical space was trained, and the BI compound library was compared to the Aldrich-Market Select (AMS) database of more than 8M purchasable compounds. To discover new (sub)structures, the "AutoZoom" tool was developed and applied to analyze the chemotypes of molecules residing in heavily populated zones of the map and to extract the corresponding maximum common substructures. A set of 401K new structures from the AMS database was retrieved and checked for drug-likeness and biological activity.
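
The library comparison described above can be illustrated with a minimal sketch. It is not the authors' method or the AutoZoom tool: it assumes each compound has already been projected onto the 2D map and hard-assigned to a single node, whereas generative topographic mapping itself yields a probability distribution over nodes. The function names, the thresholds, and the toy data are all hypothetical.

```python
from collections import Counter


def node_counts(node_ids):
    """Count how many compounds fall on each map node (hard assignment)."""
    return Counter(node_ids)


def enrichment_zones(reference_nodes, candidate_nodes,
                     min_candidates=2, max_reference=0):
    """Return nodes that are well populated by the candidate library
    but empty (or nearly empty) in the reference library.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    ref = node_counts(reference_nodes)
    cand = node_counts(candidate_nodes)
    return sorted(
        node for node, n in cand.items()
        if n >= min_candidates and ref.get(node, 0) <= max_reference
    )


# Toy data: each entry is the (row, col) map node assigned to one compound.
in_house = [(0, 0), (0, 0), (0, 1), (1, 1)]
purchasable = [(0, 0), (2, 2), (2, 2), (2, 2), (1, 1), (3, 0), (3, 0)]

zones = enrichment_zones(in_house, purchasable)
print(zones)  # nodes where the purchasable set could add new chemotypes
```

In this simplification, compounds on the returned nodes would be the candidates for closer inspection, for example by extracting a maximum common substructure per zone with a cheminformatics toolkit.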

Keywords: Big data; Chemical library diversity enrichment; Generative topographic mapping.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer-Aided Design / statistics & numerical data
  • Databases, Chemical / statistics & numerical data
  • Databases, Pharmaceutical / statistics & numerical data
  • Drug Design
  • Drug Development / statistics & numerical data
  • Drug Discovery / methods*
  • Drug Discovery / statistics & numerical data
  • Humans
  • Molecular Structure
  • Small Molecule Libraries*
  • Software
  • User-Computer Interface

