Developing an in silico minimum inhibitory concentration panel test for Klebsiella pneumoniae

Sci Rep. 2018 Jan 11;8(1):421. doi: 10.1038/s41598-017-18972-w.


Antimicrobial-resistant infections are a serious public health threat worldwide. Whole genome sequencing approaches to rapidly identify pathogens and predict antibiotic resistance phenotypes are becoming more feasible and may offer a way to reduce clinical test turnaround times compared to conventional culture-based methods, and in turn, improve patient outcomes. In this study, we use whole genome sequence data from 1668 clinical isolates of Klebsiella pneumoniae to develop an XGBoost-based machine learning model that accurately predicts minimum inhibitory concentrations (MICs) for 20 antibiotics. The overall accuracy of the model, within ±1 two-fold dilution factor, is 92%. Individual accuracies are ≥90% for 15/20 antibiotics. We show that the MICs predicted by the model correlate with known antimicrobial resistance genes. Importantly, the genome-wide approach described in this study offers a way to predict MICs for isolates without knowledge of the underlying gene content. This study shows that machine learning can be used to build a complete in silico MIC prediction panel for K. pneumoniae and provides a framework for building MIC prediction models for other pathogenic bacteria.
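
The accuracy criterion quoted in the abstract (a prediction counts as correct when it falls within ±1 two-fold dilution of the measured MIC) can be illustrated with a minimal sketch. The function names, the tolerance parameter, and the example MIC values below are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
import math

def within_one_dilution(predicted_mic, measured_mic, tolerance=1.0):
    """Return True if two MICs differ by at most `tolerance` two-fold dilution steps.

    One two-fold dilution step corresponds to a difference of 1 on a log2 scale,
    so 4 and 8 are one step apart, while 2 and 16 are three steps apart.
    """
    steps = abs(math.log2(predicted_mic) - math.log2(measured_mic))
    # Small epsilon guards against floating-point rounding at the boundary.
    return steps <= tolerance + 1e-9

def dilution_accuracy(pairs):
    """Fraction of (predicted, measured) MIC pairs that agree within one dilution."""
    hits = sum(within_one_dilution(p, m) for p, m in pairs)
    return hits / len(pairs)

# Hypothetical (predicted, measured) MIC pairs, for illustration only.
example_pairs = [(4.0, 8.0), (0.5, 0.5), (16.0, 2.0), (1.0, 0.5)]
accuracy = dilution_accuracy(example_pairs)
print(accuracy)  # 0.75: three of the four pairs agree within one dilution
```

Working on a log2 scale reflects the fact that MIC panels are measured in doubling dilutions, so an off-by-one-well error is treated the same at low and high concentrations.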

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anti-Bacterial Agents / pharmacology*
  • Computer Simulation
  • DNA, Bacterial / genetics
  • Drug Resistance, Multiple, Bacterial
  • Humans
  • Klebsiella Infections / microbiology*
  • Klebsiella pneumoniae / drug effects
  • Klebsiella pneumoniae / genetics*
  • Machine Learning
  • Microbial Sensitivity Tests
  • Models, Theoretical
  • Whole Genome Sequencing / methods*


Substances

  • Anti-Bacterial Agents
  • DNA, Bacterial