Predicting antigen specificity of single T cells based on TCR CDR3 regions

Mol Syst Biol. 2020 Aug;16(8):e9416. doi: 10.15252/msb.20199416.


It has recently become possible to simultaneously assay T-cell specificity with respect to large sets of antigens and the T-cell receptor (TCR) sequence in high-throughput single-cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T-cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single-cell immune repertoire screens can be mitigated by modeling cell-specific covariates. Lastly, we demonstrate that the number of bound peptide-MHC (pMHC) complexes can be predicted in a continuous fashion, providing a gateway to disentangle cell-to-dextramer binding strength and receptor-to-pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single-cell RNA-seq studies on T cells without the need for MHC staining.
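To make the idea of treating antigens as categorical outcome variables concrete, the following is a minimal sketch, not the TcellMatch API and not one of the deep learning architectures benchmarked in the paper. It assumes a one-hot encoding of the CDR3 amino acid sequence and swaps in a plain linear softmax classifier for brevity. All function names, the CDR3 strings, and the antigen labels are hypothetical placeholders invented for illustration; they carry no real specificity information.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_cdr3(seq, max_len=20):
    """Encode a CDR3 string as a flat, zero-padded one-hot vector (max_len * 20)."""
    vec = [0.0] * (max_len * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq[:max_len]):
        vec[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return vec


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x, W, b):
    """Probability of each antigen category for one encoded sequence."""
    logits = [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(b))]
    return softmax(logits)


def fit_softmax_classifier(X, y, n_classes, lr=0.1, epochs=200):
    """Multinomial logistic regression trained by full-batch gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            p = predict_proba(x, W, b)
            for k in range(n_classes):
                # Cross-entropy gradient: predicted probability minus one-hot target.
                err = (p[k] - (1.0 if k == label else 0.0)) / len(X)
                grad_b[k] += err
                for j, xj in enumerate(x):
                    if xj:
                        grad_W[k][j] += err * xj
        for k in range(n_classes):
            b[k] -= lr * grad_b[k]
            for j in range(n_features):
                W[k][j] -= lr * grad_W[k][j]
    return W, b


# Invented toy data: four made-up CDR3 strings, two made-up antigen categories.
toy_cdr3 = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSPDRGNTIYF", "CASSPDRGGTIYF"]
toy_labels = [0, 0, 1, 1]

X = [one_hot_cdr3(s) for s in toy_cdr3]
W, b = fit_softmax_classifier(X, toy_labels, n_classes=2)
probs = [predict_proba(x, W, b) for x in X]
predicted = [p.index(max(p)) for p in probs]
```

In this framing each antigen is simply one output category, so the model never sees the antigen sequence itself. That is the contrast the abstract draws with models that take the TCR and antigen sequences as joint inputs.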

Keywords: T-cell receptors; antigen specificity; multimodal; single cell; supervised learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Computational Biology / methods*
  • Deep Learning
  • Histocompatibility Antigens / genetics
  • Histocompatibility Antigens / metabolism*
  • Humans
  • Receptor-CD3 Complex, Antigen, T-Cell / genetics
  • Receptor-CD3 Complex, Antigen, T-Cell / metabolism*
  • Sequence Analysis, RNA
  • Single-Cell Analysis / methods*
  • Supervised Machine Learning
  • T-Lymphocytes / immunology*


Substances

  • Histocompatibility Antigens
  • Receptor-CD3 Complex, Antigen, T-Cell