Machine Learning-Guided Prediction of Antigen-Reactive In Silico Clonotypes Based on Changes in Clonal Abundance through Bio-Panning

Biomolecules. 2020 Mar 8;10(3):421. doi: 10.3390/biom10030421.


c-Met is a promising target in cancer therapy because of its intrinsic oncogenic properties. However, no c-Met-specific inhibitors are currently available in the clinic. Antibodies that block the interaction with its only known ligand, hepatocyte growth factor, and/or induce receptor internalization have been clinically tested. To explore other therapeutic antibody mechanisms, such as Fc-mediated effector function, bispecific T cell engagement, and chimeric antigen receptor T cells, a diverse panel of antibodies is essential. We prepared a chicken immune scFv library, performed four rounds of bio-panning, obtained 641 clones using a high-throughput clonal retrieval system (TrueRepertoire™, TR), and found 149 antigen-reactive scFv clones. We also prepared phagemid DNA before the start of bio-panning (round 0) and after each round of bio-panning (rounds 1-4). Next-generation sequencing of these five sets of phagemid DNA identified 860,207 HCDR3 clonotypes and 443,292 LCDR3 clonotypes, along with their clonal abundance data. We then established a TR data set consisting of the antigen reactivity of the scFv clones found in the TR analysis and the clonal abundance of their HCDR3 and LCDR3 clonotypes in the five sets of phagemid DNA. Using the TR data set, a random forest machine learning algorithm was trained to predict the binding properties of in silico HCDR3 and LCDR3 clonotypes. Subsequently, we synthesized 40 HCDR3 and 40 LCDR3 clonotypes predicted to be antigen reactive (AR) and constructed a phage-displayed scFv library called the AR library. In parallel, we prepared an antigen non-reactive (NR) library using 10 HCDR3 and 10 LCDR3 clonotypes predicted to be NR. After a single round of bio-panning, we screened 96 randomly selected phage clones from the AR library and identified 14 AR scFv clones consisting of 5 HCDR3 and 11 LCDR3 AR clonotypes. We also screened 96 randomly selected phage clones from the NR library but did not identify any AR clones.
In summary, machine learning algorithms can provide a method for identifying AR antibodies, enabling the characterization of diverse antibody libraries that are inaccessible to traditional methods.
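
The abstract does not state how clonal abundance across the five phagemid DNA sets was encoded for the random forest. As a rough illustration only, the sketch below turns per-round read counts into per-round relative frequencies plus log2 enrichment relative to round 0, which is one plausible feature layout. The function name `abundance_features`, the pseudocount, the clonotype identifiers, and all read counts are hypothetical and are not taken from the study.

```python
import math
from typing import Dict, List


def abundance_features(
    counts: Dict[str, List[int]], pseudocount: float = 1.0
) -> Dict[str, List[float]]:
    """Build per-clonotype features from read counts for rounds 0..N.

    Assumed (hypothetical) layout: the relative frequency in each round,
    followed by the log2 fold change of each later round versus round 0.
    The pseudocount avoids division by zero for unobserved clonotypes.
    """
    n_rounds = len(next(iter(counts.values())))
    n_clonotypes = len(counts)
    # Total reads per round, used to normalize counts into frequencies.
    totals = [sum(c[r] for c in counts.values()) for r in range(n_rounds)]

    features: Dict[str, List[float]] = {}
    for clonotype, c in counts.items():
        freqs = [
            (c[r] + pseudocount) / (totals[r] + pseudocount * n_clonotypes)
            for r in range(n_rounds)
        ]
        log_fold = [math.log2(freqs[r] / freqs[0]) for r in range(1, n_rounds)]
        features[clonotype] = freqs + log_fold
    return features


# Made-up read counts for rounds 0-4 (illustrative only).
toy_counts = {
    "clonotype_A": [10, 40, 160, 640, 900],    # enriched through panning
    "clonotype_B": [500, 400, 200, 50, 10],    # depleted through panning
    "clonotype_C": [490, 560, 640, 310, 90],
}

features = abundance_features(toy_counts)
for name, vec in features.items():
    print(name, [round(v, 3) for v in vec])
```

Feature vectors of this kind, paired with the antigen-reactivity labels of the retrieved clones, could then be passed to a random forest classifier. The choice of features and classifier settings here is an assumption, not the authors' reported configuration.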

Keywords: antibody discovery; c-Met; machine learning; next-generation sequencing; phage display; random forest.

Publication types

  • Research Support, Non-U.S. Gov't