Quantifying the nativeness of antibody sequences using long short-term memory networks

Protein Eng Des Sel. 2019 Dec 31;32(7):347-354. doi: 10.1093/protein/gzz031.


Antibodies often undergo substantial engineering en route to a therapeutic candidate with good developability properties. Characterization of antibody libraries has shown that retaining native-like sequences improves the overall quality of the library. Motivated by recent advances in deep learning, we developed a bi-directional long short-term memory (LSTM) network model that makes use of the large amount of available antibody sequence information, and we use this model to quantify the nativeness of antibody sequences. The model scores sequences for their similarity to naturally occurring antibodies, and this score can serve as a consideration during the design and engineering of libraries. We demonstrate the performance of this approach by training a model on human antibody sequences and show that our method outperforms other approaches at distinguishing human antibodies from those of other species. We also show the applicability of this method to the evaluation of synthesized antibody libraries and the humanization of mouse antibodies.
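As a rough illustration of how a bi-directional LSTM can assign a score to a sequence, the following minimal sketch predicts each residue from the residues on either side of it and reports the mean negative log-likelihood of the observed residues. The abstract does not give the architecture, the training procedure, or the exact score definition, so everything here is an assumption made for illustration: the class and function names (`BiLSTMScorer`, `context_states`), the hidden size, the absence of bias terms, and the choice of mean negative log-likelihood as the score. The weights are random and untrained, so the numbers it produces carry no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    vec = [0.0] * len(AMINO_ACIDS)
    vec[AA_INDEX[residue]] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """One LSTM cell: input (i), forget (f), output (o) and candidate (g) gates."""

    def __init__(self, rng, n_in, n_hidden):
        self.n_hidden = n_hidden
        self.weights = {gate: random_matrix(rng, n_hidden, n_in + n_hidden)
                        for gate in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(v) for v in matvec(self.weights["i"], z)]
        f = [sigmoid(v) for v in matvec(self.weights["f"], z)]
        o = [sigmoid(v) for v in matvec(self.weights["o"], z)]
        g = [math.tanh(v) for v in matvec(self.weights["g"], z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def context_states(cell, inputs):
    """Return states where states[t] summarizes inputs[:t] (excluding position t)."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    states = []
    for x in inputs:
        states.append(h)  # record the state before the cell sees position t
        h, c = cell.step(x, h, c)
    return states


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class BiLSTMScorer:
    """Hypothetical scorer with untrained, randomly initialized weights."""

    def __init__(self, n_hidden=8, seed=0):
        rng = random.Random(seed)
        n_aa = len(AMINO_ACIDS)
        self.forward_cell = LSTMCell(rng, n_aa, n_hidden)
        self.backward_cell = LSTMCell(rng, n_aa, n_hidden)
        self.output = random_matrix(rng, n_aa, 2 * n_hidden)

    def residue_probabilities(self, sequence):
        inputs = [one_hot(r) for r in sequence]
        # left[t] sees residues before t; right[t] sees residues after t.
        left = context_states(self.forward_cell, inputs)
        right = context_states(self.backward_cell, inputs[::-1])[::-1]
        return [softmax(matvec(self.output, l + r)) for l, r in zip(left, right)]

    def score(self, sequence):
        """Mean negative log-likelihood per residue (an assumed score definition)."""
        probs = self.residue_probabilities(sequence)
        log_likelihood = sum(math.log(p[AA_INDEX[r]])
                             for p, r in zip(probs, sequence))
        return -log_likelihood / len(sequence)


scorer = BiLSTMScorer()
# Illustrative heavy-chain framework fragment, used only to exercise the code.
print(scorer.score("EVQLVESGGGLVQPGGSLRLSCAAS"))
```

Under this assumed convention, a lower score means the model finds the sequence more probable. A model trained on human antibody sequences would therefore be expected to give lower scores to human-like sequences than to sequences from other species, which is the kind of comparison the abstract describes.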

Keywords: antibody engineering; antibody humanization; long short-term memory network; machine learning.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Antibodies / chemistry*
  • Antibodies / immunology
  • Computational Biology*
  • Humans


Substances

  • Antibodies