Minimizing Taxonomic and Natural Product Redundancy in Microbial Libraries Using MALDI-TOF MS and the Bioinformatics Pipeline IDBac

J Nat Prod. 2019 Aug 23;82(8):2167-2173. doi: 10.1021/acs.jnatprod.9b00168. Epub 2019 Jul 23.


Libraries of microorganisms have been a cornerstone of drug discovery efforts since the mid-1950s, but strain duplication in some libraries has resulted in unwanted natural product redundancy. In the current study, we implemented a workflow that minimizes both the natural product overlap and the total number of bacterial isolates in a library. Using a collection expedition to Iceland as an example, we purified every distinct bacterial colony off isolation plates derived from 86 environmental samples. We employed our mass spectrometry (MS)-based IDBac workflow on these isolates to form groups of taxa based on protein MS fingerprints (3-15 kDa) and further distinguished taxa subgroups based on their degree of overlap within corresponding natural product spectra (0.2-2 kDa). This informed the decision to create a library of 301 isolates spanning 54 genera. This process required only 25 h of data acquisition and 2 h of analysis. In a separate experiment, we reduced the size of an existing library based on the degree of metabolic overlap observed in natural product MS spectra of bacterial colonies (from 833 to 233 isolates, a 72.0% size reduction). Overall, our pipeline allows for a significant reduction in costs associated with library generation and minimizes natural product redundancy entering into downstream biological screening efforts.
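The two-tier selection described in the abstract (group isolates by protein fingerprint, then keep only the group members whose natural product spectra differ) can be illustrated with a minimal sketch. This is not IDBac's implementation: the representation of spectra as sets of binned m/z peaks, the Jaccard similarity, the greedy grouping, and the thresholds 0.5 and 0.5 are all assumptions made for illustration, and the isolate names and peak values are invented toy data.

```python
from typing import Dict, FrozenSet, List

PeakSet = FrozenSet[int]  # assumed representation: binned m/z values present in a spectrum


def jaccard(a: PeakSet, b: PeakSet) -> float:
    """Shared peaks divided by all peaks; two empty spectra count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_by_fingerprint(protein: Dict[str, PeakSet], min_similarity: float) -> List[List[str]]:
    """Greedy grouping: an isolate joins the first group whose first member it resembles."""
    groups: List[List[str]] = []
    for name in sorted(protein):
        for group in groups:
            if jaccard(protein[name], protein[group[0]]) >= min_similarity:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def pick_representatives(group: List[str], metabolite: Dict[str, PeakSet],
                         max_overlap: float) -> List[str]:
    """Keep an isolate only if its metabolite peaks differ enough from every isolate already kept."""
    kept: List[str] = []
    for name in group:
        if all(jaccard(metabolite[name], metabolite[k]) <= max_overlap for k in kept):
            kept.append(name)
    return kept


def dereplicate(protein: Dict[str, PeakSet], metabolite: Dict[str, PeakSet],
                min_similarity: float = 0.5, max_overlap: float = 0.5) -> List[str]:
    """Group by protein fingerprint, then retain metabolically distinct members of each group."""
    library: List[str] = []
    for group in group_by_fingerprint(protein, min_similarity):
        library.extend(pick_representatives(group, metabolite, max_overlap))
    return library


# Toy data (invented): protein peaks in the 3-15 kDa range, metabolite peaks in the 0.2-2 kDa range.
protein_peaks = {
    "A": frozenset({3000, 5000, 7000, 9000}),
    "B": frozenset({3000, 5000, 7000, 9100}),  # close to A
    "C": frozenset({4000, 6000, 8000}),        # unrelated to A
    "D": frozenset({3000, 5000, 7000, 9000}),  # same fingerprint as A
}
metabolite_peaks = {
    "A": frozenset({300, 500, 700}),
    "B": frozenset({300, 500, 700}),  # redundant with A
    "C": frozenset({400}),
    "D": frozenset({1200, 1500}),     # same taxon group as A, different chemistry
}

library = dereplicate(protein_peaks, metabolite_peaks)
print(library)  # ['A', 'D', 'C']
```

In this toy run, isolate B is dropped because it falls in the same fingerprint group as A and shares all of A's metabolite peaks, while D is retained despite an identical protein fingerprint because its metabolite peaks do not overlap with A's. The result of a greedy pass depends on the order in which isolates are visited, which is one reason a production workflow would more likely use a full distance matrix and hierarchical clustering.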

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / chemistry*
  • Biological Products / chemistry*
  • Biological Products / pharmacology
  • Computational Biology*
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods*


Substances

  • Biological Products