COVID-19 vaccine design using reverse and structural vaccinology, ontology-based literature mining and machine learning

Brief Bioinform. 2022 Jul 18;23(4):bbac190. doi: 10.1093/bib/bbac190.


Rational vaccine design, especially vaccine antigen identification and optimization, is critical to successful and efficient vaccine development against various infectious diseases, including coronavirus disease 2019 (COVID-19). In general, computational vaccine design includes three major stages: (i) identification and annotation of experimentally verified gold standard protective antigens through literature mining, (ii) rational vaccine design using reverse vaccinology (RV) and structural vaccinology (SV) and (iii) post-licensure surveillance of vaccine success and adverse events, and use of the results to inform vaccine design. Protegen is a database of experimentally verified protective antigens, which can be used as gold standard data for rational vaccine design. RV predicts protective antigen targets primarily from genome sequence analysis. SV refines antigens through structural engineering. Recently, RV and SV approaches, with the support of various machine learning methods, have been applied to COVID-19 vaccine design. The analysis of post-licensure vaccine adverse event report data also provides valuable results on vaccine safety and on how vaccines should be used or when their use should be paused. Ontologies standardize and integrate heterogeneous data and knowledge in a human- and computer-interpretable manner, further supporting machine learning and vaccine design. Future directions for rational vaccine design are discussed.

Keywords: COVID-19; machine learning; ontology; reverse vaccinology; structural vaccinology.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 Vaccines
  • COVID-19* / prevention & control
  • Data Mining
  • Humans
  • Machine Learning
  • Vaccines* / chemistry
  • Vaccines* / genetics
  • Vaccinology / methods


Substances

  • COVID-19 Vaccines
  • Vaccines