Validating new tuberculosis computational models with public whole cell screening aerobic activity datasets

Pharm Res. 2011 Aug;28(8):1859-69. doi: 10.1007/s11095-011-0413-x. Epub 2011 Mar 10.


Purpose: The search for small molecules with activity against Mycobacterium tuberculosis (Mtb) increasingly uses high-throughput screening and computational methods. Several public datasets from the Collaborative Drug Discovery Tuberculosis (CDD TB) database have been evaluated with cheminformatics approaches to validate their utility and suggest compounds for testing.

Methods: Previously reported Bayesian classification models were used to predict a set of 283 Novartis compounds tested against Mtb (containing aerobic and anaerobic hits) and to search FDA-approved drugs. The Novartis compounds were also filtered with computational SMARTS alerts to identify potentially undesirable substructures.
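The abstract does not describe the internals of the Bayesian models, so the following is only a minimal sketch of one common formulation: a Laplacian-corrected naive Bayes score over binary substructure features. The feature names, the toy training set, and the `train` and `score` helpers are all hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def train(feature_sets, labels):
    """Count how often each feature occurs in active and in all training compounds."""
    active_counts, total_counts = Counter(), Counter()
    for features, is_active in zip(feature_sets, labels):
        for f in features:
            total_counts[f] += 1
            if is_active:
                active_counts[f] += 1
    base_rate = sum(labels) / len(labels)  # fraction of actives in the training set
    return active_counts, total_counts, base_rate


def score(features, model):
    """Sum per-feature log weights; a higher score means more 'active-like'."""
    active_counts, total_counts, base_rate = model
    total = 0.0
    for f in features:
        # Laplacian correction: rarely seen features are pulled toward a weight of 0.
        ratio = (active_counts[f] + 1) / (total_counts[f] * base_rate + 1)
        total += math.log(ratio)
    return total


# Hypothetical toy data: each compound is a set of substructure labels.
training_features = [{"nitro", "amide"}, {"nitro"}, {"ester"}, {"ester", "amide"}]
training_labels = [1, 1, 0, 0]  # 1 = active, 0 = inactive
model = train(training_features, training_labels)

# Rank unseen compounds by score, best first.
candidates = {"cpd_A": {"nitro"}, "cpd_B": {"ester"}, "cpd_C": {"unseen"}}
ranked = sorted(candidates, key=lambda name: score(candidates[name], model), reverse=True)
print(ranked)
```

In this toy setting a feature seen only in actives gets a positive weight, one seen only in inactives gets a negative weight, and a feature absent from the training data contributes nothing, so a test set can be ranked and its top fraction inspected for hits.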

Results: Using the Novartis compounds as a test set for the Bayesian models demonstrated a >4.0-fold enrichment over random screening for finding aerobic hits not in the computational models (N = 34). A 10-fold enrichment was observed for finding Mtb active compounds in the FDA drugs database. 85.9% of the Novartis compounds failed the Abbott SMARTS alerts, a value substantially higher than for known TB drugs. Higher failure rates for SMARTS filters from different groups also correlate with the number of Lipinski violations.
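Fold enrichment over random screening is conventionally the hit rate in the model-selected subset divided by the hit rate of the whole set. The sketch below assumes that definition; the abstract does not state the subset size used, and the numbers in the example are hypothetical rather than the study's data.

```python
def fold_enrichment(hits_in_selection, selection_size, hits_total, library_size):
    """Hit rate of a model-selected subset divided by the library-wide hit rate."""
    if selection_size <= 0 or library_size <= 0:
        raise ValueError("selection_size and library_size must be positive")
    if hits_total <= 0:
        raise ValueError("the library must contain at least one hit")
    # Equivalent to (hits_in_selection / selection_size) / (hits_total / library_size),
    # rearranged so that only one division is performed.
    return (hits_in_selection * library_size) / (selection_size * hits_total)


# Hypothetical example: 200 compounds with 20 hits (10% baseline hit rate);
# the top 20 ranked compounds contain 8 hits (40% hit rate).
example = fold_enrichment(hits_in_selection=8, selection_size=20,
                          hits_total=20, library_size=200)
print(example)  # 4.0
```

A value of 1.0 means the model does no better than picking compounds at random; values above 1.0 indicate that hits are concentrated among the top-ranked compounds.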

Conclusions: These computational approaches may assist in finding desirable leads for tuberculosis drug discovery.

MeSH terms

  • Antitubercular Agents / chemistry
  • Antitubercular Agents / pharmacology*
  • Bayes Theorem
  • Computer Simulation
  • Databases, Factual
  • Drug Discovery / methods
  • Models, Biological*
  • Mycobacterium tuberculosis / drug effects
  • Small Molecule Libraries / chemistry
  • Small Molecule Libraries / pharmacology*
  • Tuberculosis / drug therapy*


Substances

  • Antitubercular Agents
  • Small Molecule Libraries