The field of synthetic biology aims to make the design of biological systems predictable, shrinking the huge design space to a practical number of candidates for testing. When designing microbial cell factories, most optimization efforts have focused on enzyme and strain selection/engineering, pathway regulation, and process development. In silico tools for the predictive design of bacterial ribosome binding sites (RBSs) and RBS libraries now allow translational tuning of biochemical pathways; however, methods for predicting optimal RBS combinations in multigene pathways are still needed. Here we present the implementation of machine learning algorithms to model the RBS sequence-phenotype relationship from representative subsets of large combinatorial RBS libraries, allowing the accurate prediction of high producers. Applied to a recombinant monoterpenoid production pathway in Escherichia coli, our approach boosted production titers by over 60% while screening under 3% of a library. To facilitate library screening, a multiwell plate fermentation procedure was developed, allowing increased screening throughput with sufficient resolution to discriminate between high and low producers. High producers from one library did not carry over to scale-up, but the reduced screening requirements allowed rapid rescreening at the larger scale. This methodology is potentially compatible with any biochemical pathway and provides a powerful tool for the predictive design of bacterial production chassis.
Keywords: machine learning; pathway engineering; ribosome binding site; synthetic biology; terpenoids; translational tuning.
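
The screen-then-predict workflow summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the library layout (two RBSs with three degenerate positions each), the synthetic `measure_titer` function standing in for a fermentation assay, and the k-nearest-neighbour regressor on Hamming distance, which is swapped in here as a simple stand-in for whichever learning algorithms were actually used.

```python
import itertools
import random

random.seed(0)
BASES = "ACGT"

# Hypothetical library: two RBSs with three degenerate positions each,
# written as one 6-nt string, giving 4**6 = 4096 combinations.
library = ["".join(p) for p in itertools.product(BASES, repeat=6)]

# Synthetic stand-in for a measured titer (NOT data from the study):
# an additive per-position effect plus measurement noise.
weights = [{b: random.gauss(0, 1) for b in BASES} for _ in range(6)]

def measure_titer(seq):
    """Hypothetical assay: returns a noisy synthetic titer for one variant."""
    return sum(w[b] for w, b in zip(weights, seq)) + random.gauss(0, 0.2)

# Screen a random subset covering just under 3% of the library.
n_screen = int(0.03 * len(library))
screened = random.sample(library, n_screen)
titers = {s: measure_titer(s) for s in screened}

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def predict(seq, k=5):
    """Predict a titer as the mean of the k closest screened variants."""
    nearest = sorted(titers, key=lambda s: hamming(s, seq))[:k]
    return sum(titers[s] for s in nearest) / k

# Rank every unscreened variant by predicted titer and keep the top ten
# as candidates for a second, much smaller round of screening.
screened_set = set(screened)
unscreened = [s for s in library if s not in screened_set]
ranked = sorted(unscreened, key=predict, reverse=True)
top_candidates = ranked[:10]
```

In practice the synthetic assay would be replaced by measured titers from the sampled clones, and the predicted top candidates would be built and rescreened to confirm the model's ranking.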