QSAR Models for Active Substances against Pseudomonas aeruginosa Using Disk-Diffusion Test Data

Molecules. 2021 Mar 19;26(6):1734. doi: 10.3390/molecules26061734.


Pseudomonas aeruginosa is a Gram-negative bacillus and one of the six "ESKAPE" microbial species, which have an outstanding ability to "escape" currently used antibiotics; developing new antibiotics against it is of the highest priority. Whereas minimum inhibitory concentration (MIC) values against Pseudomonas aeruginosa have previously been used for QSAR model development, disk diffusion results (inhibition zones) have apparently not been used for this purpose in the literature, and we decided to explore their use to this end. We developed multiple QSAR models using several machine learning algorithms (support vector classifier, k-nearest neighbors (KNN), random forest classifier, decision tree classifier, AdaBoost classifier, logistic regression and naïve Bayes classifier). We used four sets of molecular descriptors and fingerprints and three different methods of data balancing, together with the "native" data set. In total, 32 models were built for each combination of descriptor or fingerprint set and balancing method, of which 28 were selected and stacked to create meta-models. Among the individual algorithms, the best balanced accuracy was obtained with KNN, logistic regression and the decision tree classifier, but the ensemble method had slightly superior results in nested cross-validation.
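
The abstract ranks models by balanced accuracy, which matters when active and inactive compounds are unevenly represented. As a minimal sketch (not the authors' code), the metric can be computed as the mean of the per-class recalls. The function names and the 10/90 class split below are hypothetical and chosen only for illustration; they do not come from the study's data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of labels predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class weighs the same
    regardless of how many samples it contains."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # samples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / n for c, n in support.items()) / len(support)


# Hypothetical imbalanced labels: 1 = active, 0 = inactive.
y_true = [1] * 10 + [0] * 90
# A trivial "model" that always predicts the majority class.
always_inactive = [0] * 100

print(accuracy(y_true, always_inactive))           # 0.9
print(balanced_accuracy(y_true, always_inactive))  # 0.5
```

In this illustration a classifier that never predicts "active" still reaches 90% plain accuracy but only 50% balanced accuracy, which is why the latter is the more informative score on imbalanced sets.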

Keywords: AdaBoost; KNN; QSAR; antimicrobial; chemical descriptors; machine-learning; Pseudomonas; support vector classifier.

MeSH terms

  • Anti-Bacterial Agents / pharmacology*
  • Disk Diffusion Antimicrobial Tests*
  • Models, Biological*
  • Pseudomonas aeruginosa / growth & development*
  • Quantitative Structure-Activity Relationship
  • Support Vector Machine*


Substances

  • Anti-Bacterial Agents