Global analysis of adenylate-forming enzymes reveals β-lactone biosynthesis pathway in pathogenic Nocardia

J Biol Chem. 2020 Oct 30;295(44):14826-14839. doi: 10.1074/jbc.RA120.013528. Epub 2020 Aug 21.


Enzymes that cleave ATP to activate carboxylic acids play essential roles in primary and secondary metabolism in all domains of life. Class I adenylate-forming enzymes share a conserved structural fold but act on a wide range of substrates to catalyze reactions involved in bioluminescence, nonribosomal peptide biosynthesis, fatty acid activation, and β-lactone formation. Despite their metabolic importance, the substrates and functions of the vast majority of adenylate-forming enzymes remain unknown, and no tools are available to predict them accurately. Given the crucial roles of adenylate-forming enzymes in biosynthesis, this gap also severely limits our ability to predict natural product structures from biosynthetic gene clusters. Here we used machine learning to predict adenylate-forming enzyme function and substrate specificity from protein sequences. We built a web-based predictive tool and used it to comprehensively map the biochemical diversity of adenylate-forming enzymes across >50,000 candidate biosynthetic gene clusters in bacterial, fungal, and plant genomes. Ancestral phylogenetic reconstruction and sequence similarity networking of enzymes from these clusters suggested divergent evolution of the adenylate-forming superfamily from a core enzyme scaffold most closely related to contemporary CoA ligases toward more specialized functions, including β-lactone synthetases. Our classifier predicted β-lactone synthetases in uncharacterized biosynthetic gene clusters conserved in >90 different strains of Nocardia. To test our prediction, we purified a candidate β-lactone synthetase from Nocardia brasiliensis and reconstituted the biosynthetic pathway in vitro to link the gene cluster to the β-lactone natural product, nocardiolactone. We anticipate that our machine learning approach will aid in functional classification of enzymes and advance natural product discovery.

Keywords: Nocardia; acetyl-CoA synthetase; adenylate-forming enzymes; bioinformatics; coenzyme A (CoA); enzyme catalysis; machine learning; natural product biosynthesis; substrate specificity; β-lactone synthetases.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adenosine Monophosphate / biosynthesis*
  • Catalysis
  • Lactones / metabolism*
  • Ligases / genetics
  • Ligases / metabolism*
  • Machine Learning
  • Multigene Family
  • Nocardia / enzymology
  • Nocardia / metabolism*
  • Phylogeny
  • Reproducibility of Results
  • Substrate Specificity


Substances

  • Lactones
  • Adenosine Monophosphate
  • Ligases

Supplementary concepts

  • Nocardia brasiliensis