DrugQuest - a text mining workflow for drug association discovery

BMC Bioinformatics. 2016 Jun 6;17 Suppl 5(Suppl 5):182. doi: 10.1186/s12859-016-1041-6.


Background: Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of biomedical literature and of the information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases.

Results: Herein, we apply a text mining approach to the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques to these fields to identify chemicals, proteins, genes, pathways and diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a variety of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible reasons why these records are clustered together. Different views, such as chemicals clustered by their textual information and tag clouds consisting of Significant Terms along with the terms that were used for clustering, are delivered to the user through a user-friendly web interface.
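
The idea of grouping records by their common terms can be illustrated with a minimal Python sketch. It is not the DrugQuest implementation: the record texts are invented toy strings rather than DrugBank content, the stop-word list and the 0.3 threshold are arbitrary assumptions, and a simple threshold-based grouping (connected components over cosine similarity) stands in for the similarity and partitional clustering techniques mentioned above. The helper names `term_vector`, `cosine`, `cluster_records` and `shared_terms` are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the one used by DrugQuest).
STOPWORDS = {"a", "an", "and", "of", "the", "to", "in", "for", "is", "that", "by"}

# Invented toy texts standing in for record text fields; NOT real DrugBank data.
records = {
    "rec1": "inhibits cyclooxygenase enzyme reducing prostaglandin synthesis and inflammation",
    "rec2": "cyclooxygenase inhibitor that reduces prostaglandin synthesis pain and fever",
    "rec3": "blocks beta adrenergic receptor lowering heart rate and blood pressure",
    "rec4": "beta adrenergic receptor antagonist for high blood pressure",
}


def term_vector(text):
    """Bag-of-words term counts for one record, with stop words removed."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_records(recs, threshold=0.3):
    """Group records whose pairwise similarity reaches the threshold.

    Records are linked when cosine similarity >= threshold, and each
    connected component of linked records becomes one cluster.
    """
    vectors = {rid: term_vector(text) for rid, text in recs.items()}
    ids = list(recs)
    parent = {rid: rid for rid in ids}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for rid in ids:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(group) for group in groups.values())


def shared_terms(recs, cluster):
    """Terms common to every record in a cluster (a crude 'why clustered' view)."""
    term_sets = [set(term_vector(recs[rid])) for rid in cluster]
    return sorted(set.intersection(*term_sets))


if __name__ == "__main__":
    for cluster in cluster_records(records):
        print(cluster, shared_terms(records, cluster))
```

Listing the terms shared by every member of a cluster is one simple way to inspect why records ended up together. In a fuller pipeline the raw tokens would be replaced by the entities and significant terms extracted in the earlier steps.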

Conclusions: DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest .

Keywords: Chemicals; Data integration; Document clustering; Drug associations; Knowledge discovery; Named entity recognition; Text mining.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Databases, Factual
  • Drug Discovery*
  • Humans
  • Internet
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / metabolism
  • User-Computer Interface*


Substances

  • Pharmaceutical Preparations