Large scale active-learning-guided exploration for in vitro protein production optimization

Nat Commun. 2020 Apr 20;11(1):1872. doi: 10.1038/s41467-020-15798-5.


Lysate-based cell-free systems have become a major platform for studying gene expression, but batch-to-batch variation makes protein production difficult to predict. Here we describe an active learning approach to explore a combinatorial space of ~4,000,000 cell-free buffer compositions, maximizing protein production and identifying critical parameters involved in cell-free productivity. We also provide a one-step method to achieve high-quality predictions of protein production with minimal experimental effort, regardless of lysate quality.
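
To make the idea of active-learning-guided exploration concrete, the following is a minimal sketch of a generic measure, predict, and select loop over a combinatorial buffer space. It is not the authors' model or protocol. The surrogate here is a simple inverse-distance-weighted predictor, with distance to the nearest measured composition as an uncertainty proxy; it is swapped in for illustration only. The grid of 11 components at 4 concentration levels is an assumption, chosen only because 4^11 = 4,194,304 is on the order of the ~4,000,000 compositions mentioned above. The function `simulated_yield` is a hypothetical stand-in for a real measurement and produces no real data.

```python
import random

N_COMPONENTS = 11  # assumption: number of buffer components varied
N_LEVELS = 4       # assumption: concentration levels per component


def simulated_yield(comp):
    """Hypothetical stand-in for a measured yield (not real data)."""
    # Toy landscape: each component contributes most at level 2.
    return sum(1.0 - abs(level - 2) / 3.0 for level in comp) / len(comp)


def distance(a, b):
    """Manhattan distance between two compositions (tuples of level indices)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict(comp, measured):
    """Return (predicted yield, uncertainty proxy) for one composition.

    Prediction is an inverse-squared-distance weighted mean of measured yields;
    the uncertainty proxy is the distance to the nearest measured composition.
    """
    weight_sum, weighted, nearest = 0.0, 0.0, float("inf")
    for other, y in measured.items():
        d = distance(comp, other)
        if d == 0:
            return y, 0.0
        w = 1.0 / (d * d)
        weight_sum += w
        weighted += w * y
        nearest = min(nearest, d)
    return weighted / weight_sum, nearest


def active_learning(rounds=5, batch=20, pool=500, kappa=0.02, seed=0):
    """Run the loop; return all measurements and the best yield after each round."""
    rng = random.Random(seed)

    def random_comp():
        return tuple(rng.randrange(N_LEVELS) for _ in range(N_COMPONENTS))

    # Round 0: a random initial batch.
    measured = {}
    while len(measured) < batch:
        c = random_comp()
        measured[c] = simulated_yield(c)
    best_per_round = [max(measured.values())]

    for _ in range(rounds):
        # Sample a candidate pool instead of enumerating the whole space.
        candidates = sorted({random_comp() for _ in range(pool)})
        candidates = [c for c in candidates if c not in measured]

        # Upper-confidence-bound style score: prediction plus exploration bonus.
        def score(c):
            mean, unc = predict(c, measured)
            return mean + kappa * unc

        for c in sorted(candidates, key=score, reverse=True)[:batch]:
            measured[c] = simulated_yield(c)  # "run the experiment"
        best_per_round.append(max(measured.values()))

    return measured, best_per_round


if __name__ == "__main__":
    measured, best = active_learning()
    print(f"space size: {N_LEVELS ** N_COMPONENTS}")
    print(f"compositions measured: {len(measured)}")
    print("best yield per round:", [round(b, 3) for b in best])
```

The parameter `kappa` sets the balance between exploiting high predicted yield and exploring compositions far from anything already measured. In a real setting, the surrogate would be replaced by a trained regression model and `simulated_yield` by actual measurements.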

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / metabolism
  • Cell-Free System
  • Gene Expression
  • Machine Learning
  • Protein Biosynthesis*
  • Proteins / metabolism*
  • Synthetic Biology


Substances

  • Proteins