Chapter 8. Methods for in silico prediction of microbial polyketide and nonribosomal peptide biosynthetic pathways from DNA sequence data

Methods Enzymol. 2009;458:181-217. doi: 10.1016/S0076-6879(09)04808-3.


Foreknowledge of the secondary metabolic potential of cultivated and previously uncultivated microorganisms can facilitate natural product discovery. By combining sequence-based knowledge with biochemical precedent, translated gene sequence data can be used to rapidly derive the structural elements encoded by microbial secondary metabolic gene clusters. These structural elements provide an estimate of the secondary metabolic potential of a given organism and a starting point for identifying potential lead compounds in isolation and structure elucidation campaigns. The accuracy of these predictions for a given translated gene sequence depends on the biochemistry of the metabolite class, similarity to known metabolite gene clusters, and the depth of knowledge concerning the biosynthetic machinery. This chapter introduces methods for predicting structural elements for two well-studied classes: modular polyketides and nonribosomal peptides. A bioinformatics tool is presented for rapid preliminary analysis of these modular systems, and prototypical methods for converting these analyses into substructural elements are described.
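
As an illustration of how a domain-level analysis of a modular polyketide synthase might be converted into substructural elements, the following minimal Python sketch applies the textbook rule that the reductive domains in a module (KR, DH, ER) set the oxidation state of the beta-carbon introduced by that module. It is not the tool presented in the chapter. The names `Module`, `beta_carbon_state`, and `predict` are hypothetical, the domain lists are assumed to come from a separate annotation step, and the extender unit is assumed to have been assigned already from acyltransferase (AT) specificity. The sketch also assumes every annotated domain is catalytically active, which real clusters do not guarantee.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """One PKS extension module (hypothetical container for this sketch)."""
    name: str
    domains: tuple                  # e.g. ("KS", "AT", "KR", "ACP")
    extender: str = "malonyl-CoA"   # assumed input from an AT-specificity call


def beta_carbon_state(domains):
    """Infer the beta-carbon functional group from the reductive domains.

    Assumes every listed domain is active; inactive KR/DH/ER domains
    would make this prediction wrong.
    """
    present = set(domains)
    if "KR" not in present:
        return "ketone"      # no ketoreduction: beta-keto group retained
    if "DH" not in present:
        return "hydroxyl"    # KR only: beta-hydroxyl
    if "ER" not in present:
        return "alkene"      # KR + DH: alpha,beta double bond
    return "methylene"       # KR + DH + ER: fully reduced


def predict(modules):
    """Return (module name, extender unit, beta-carbon state) per module."""
    return [(m.name, m.extender, beta_carbon_state(m.domains)) for m in modules]


if __name__ == "__main__":
    # Invented example modules, not taken from any real gene cluster.
    example = [
        Module("M1", ("KS", "AT", "ACP")),
        Module("M2", ("KS", "AT", "KR", "ACP"), "methylmalonyl-CoA"),
        Module("M3", ("KS", "AT", "DH", "ER", "KR", "ACP")),
    ]
    for row in predict(example):
        print(row)
```

A lookup of this kind yields only a first-pass substructure for each module. As the abstract notes, the reliability of any such prediction depends on how well the biochemistry of the metabolite class is understood and on similarity to characterized gene clusters.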

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computational Biology / methods
  • Macrolides / metabolism*
  • Molecular Sequence Data
  • Molecular Structure
  • Peptide Biosynthesis / genetics
  • Peptide Biosynthesis / physiology*
  • Signal Transduction / genetics
  • Signal Transduction / physiology*


Substances

  • Macrolides