Ultra-high-diversity factorizable libraries for efficient therapeutic discovery

Genome Res. 2022 Jun 23;32(9):1787-1794. doi: 10.1101/gr.276593.122. Online ahead of print.

Abstract

The successful discovery of novel biological therapeutics by selection requires highly diverse libraries of candidate sequences that contain a high proportion of desirable candidates. Here we propose computationally designed factorizable libraries, built by concatenating segment libraries, as a low-cost method of creating large libraries that meet an objective function. We show that factorizable libraries can be designed efficiently by representing objective functions that describe sequence optimality as an inner product of feature vectors, a representation we use to design an optimization method we call stochastically annealed product spaces (SAPS). We then use this approach to design diverse and efficient libraries of antibody CDR-H3 sequences with various optimized characteristics.
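
The role of the inner-product representation can be illustrated with a small sketch. This is not the SAPS method itself, and none of the names or numbers below come from the paper: the segment libraries `prefixes` and `suffixes` and the feature maps `prefix_features` and `suffix_features` are hypothetical placeholders. The sketch assumes a two-segment library and an objective equal to the inner product of one feature vector per segment. Under that assumption, the objective summed over every concatenated sequence equals the inner product of the per-segment feature sums, so it can be evaluated without enumerating the full library.

```python
from itertools import product

# Hypothetical segment libraries, chosen only for illustration.
prefixes = ["AR", "AK", "VR"]
suffixes = ["DY", "DV", "FDY", "MDV"]


def library(prefixes, suffixes):
    """Enumerate the factorizable library: every prefix joined to every suffix."""
    return [p + s for p, s in product(prefixes, suffixes)]


def prefix_features(segment):
    """Hypothetical feature map: a constant term and the number of arginines."""
    return [1.0, float(segment.count("R"))]


def suffix_features(segment):
    """Hypothetical feature map: the number of aspartates and the segment length."""
    return [float(segment.count("D")), float(len(segment))]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def vector_sum(vectors):
    """Component-wise sum of a list of equal-length vectors."""
    return [sum(column) for column in zip(*vectors)]


def total_objective_brute_force(prefixes, suffixes):
    """Sum the objective over every sequence: len(prefixes) * len(suffixes) terms."""
    return sum(
        dot(prefix_features(p), suffix_features(s))
        for p, s in product(prefixes, suffixes)
    )


def total_objective_factorized(prefixes, suffixes):
    """Same total from per-segment sums: len(prefixes) + len(suffixes) feature evaluations."""
    prefix_total = vector_sum([prefix_features(p) for p in prefixes])
    suffix_total = vector_sum([suffix_features(s) for s in suffixes])
    return dot(prefix_total, suffix_total)


if __name__ == "__main__":
    print(len(library(prefixes, suffixes)))
    print(total_objective_brute_force(prefixes, suffixes))
    print(total_objective_factorized(prefixes, suffixes))
```

Here 3 + 4 = 7 synthesized segments yield 3 × 4 = 12 sequences. The gap between the additive cost and the multiplicative diversity widens as segments are added or enlarged, which is why an objective that factorizes in this way is convenient for library design.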