ChemmineR: a compound mining framework for R

Bioinformatics. 2008 Aug 1;24(15):1733-4. doi: 10.1093/bioinformatics/btn307. Epub 2008 Jul 2.


Motivation: Software applications for structural similarity searching and clustering of small molecules play an important role in drug discovery and chemical genomics. Here, we present the first open-source compound mining framework for the popular statistical programming environment R. The integration with a powerful statistical environment maximizes the flexibility, expandability and programmability of the provided analysis functions.

Results: We discuss the algorithms and compound mining utilities provided by the R package ChemmineR. It contains functions for structural similarity searching and for clustering compound libraries with a wide spectrum of classification algorithms, as well as various utilities for managing complex compound data. It also offers a wide range of visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine environment and allows bidirectional communication between the two services.
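
As a rough illustration of what structural similarity searching and clustering involve, the following language-agnostic sketch (written in Python) compares compounds by the overlap of their structural features. It is not ChemmineR code and does not use the package's API. The Tanimoto coefficient, the single-linkage threshold clustering, the function names, and the toy feature sets are all assumptions chosen for illustration; the abstract does not specify which similarity measure or clustering algorithms the package uses.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two feature sets (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_search(query, library, cutoff=0.5):
    """Return (name, score) pairs with score >= cutoff, best match first."""
    hits = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    hits = [(name, score) for name, score in hits if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def threshold_cluster(library, cutoff=0.5):
    """Single-linkage clustering: link compounds whose similarity >= cutoff."""
    names = list(library)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if tanimoto(library[a], library[b]) >= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy library: each compound is a set of made-up bond features.
library = {
    "cmp1": {"C-C", "C-O", "C-N", "O-H"},
    "cmp2": {"C-C", "C-O", "C-N"},
    "cmp3": {"C-C", "C-Cl"},
    "cmp4": {"S-O", "S-N"},
}

print(similarity_search(library["cmp1"], library))
print(threshold_cluster(library))
```

In this toy library, cmp1 and cmp2 share three of four distinct features and therefore end up in the same cluster at a cutoff of 0.5, while cmp3 and cmp4 remain singletons.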

Availability: ChemmineR is freely available as an R package from the ChemMine project site:

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Biopolymers / chemistry*
  • Database Management Systems*
  • Databases, Factual*
  • Information Storage and Retrieval / methods*
  • Programming Languages*
  • Software*

