COVID-19 Coronavirus Vaccine Design Using Reverse Vaccinology and Machine Learning

Front Immunol. 2020 Jul 3;11:1581. doi: 10.3389/fimmu.2020.01581. eCollection 2020.


To ultimately combat the emerging COVID-19 pandemic, an effective and safe vaccine is needed against this highly contagious disease caused by the SARS-CoV-2 coronavirus. Our literature and clinical trial survey showed that the whole virus, as well as the spike (S) protein, nucleocapsid (N) protein, and membrane (M) protein, have been tested for vaccine development against SARS and MERS. However, these vaccine candidates might not induce complete protection and might raise safety concerns. We then applied Vaxign and the newly developed machine learning-based Vaxign-ML reverse vaccinology tools to predict COVID-19 vaccine candidates. Our Vaxign analysis found that the SARS-CoV-2 N protein sequence is conserved with SARS-CoV and MERS-CoV but not with the other four human coronaviruses, which cause mild symptoms. In an analysis of the entire SARS-CoV-2 proteome, six proteins, including the S protein and five non-structural proteins (nsp3, 3CL-pro, and nsp8-10), were predicted to be adhesins, which are crucial to viral adherence and host invasion. The S, nsp3, and nsp8 proteins were also predicted by Vaxign-ML to have high protective antigenicity. Unlike the commonly used S protein, the nsp3 protein has not been tested in any coronavirus vaccine studies, and it was selected for further investigation. The nsp3 protein was found to be more conserved among SARS-CoV-2, SARS-CoV, and MERS-CoV than among 15 coronaviruses infecting humans and other animals. The protein was also predicted to contain promiscuous MHC-I and MHC-II T-cell epitopes, and the predicted linear B-cell epitopes were found to be localized on the surface of the protein. Our predicted vaccine targets have the potential for effective and safe COVID-19 vaccine development. We also propose that an "Sp/Nsp cocktail vaccine" containing one or more structural proteins (Sp) and one or more non-structural proteins (Nsp) would stimulate effective complementary immune responses.
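The candidate-selection logic summarized above (keep proteins that are predicted to be adhesins and that also score highly for protective antigenicity) can be illustrated with a minimal sketch. This is not the Vaxign or Vaxign-ML implementation or API: the `ProteinScores` type, the `shortlist` function, the protein names, the scores, and both cutoffs are hypothetical placeholders chosen only to show the filtering step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProteinScores:
    """Hypothetical per-protein predictions (not real Vaxign output)."""
    name: str
    adhesin_probability: float   # assumed scale: 0.0 to 1.0
    protegenicity_score: float   # assumed scale: 0.0 to 100.0


def shortlist(
    proteins: Iterable[ProteinScores],
    adhesin_cutoff: float,
    protegenicity_cutoff: float,
) -> List[str]:
    """Return names of proteins that pass both cutoffs, in input order."""
    return [
        p.name
        for p in proteins
        if p.adhesin_probability >= adhesin_cutoff
        and p.protegenicity_score >= protegenicity_cutoff
    ]


# Toy inputs: made-up names and scores, for illustration only.
toy_proteins = [
    ProteinScores("protein_A", 0.80, 95.0),  # passes both filters
    ProteinScores("protein_B", 0.70, 40.0),  # adhesin-like, low protegenicity
    ProteinScores("protein_C", 0.30, 97.0),  # high protegenicity, not adhesin-like
    ProteinScores("protein_D", 0.65, 92.0),  # passes both filters
]

# Cutoffs are assumptions for this sketch, not the tools' published defaults.
candidates = shortlist(toy_proteins, adhesin_cutoff=0.5, protegenicity_cutoff=90.0)
print(candidates)
```

In this sketch the two filters are applied jointly, so a protein is shortlisted only when both predictions agree. Any further prioritization, such as preferring targets not yet tested in vaccine studies, would be a separate step.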

Keywords: COVID-19; S protein; machine learning; non-structural protein 3; reverse vaccinology; vaccine; Vaxign; Vaxign-ML.

Publication types

  • Research Support, N.I.H., Extramural
  • Systematic Review

MeSH terms

  • Animals
  • Betacoronavirus* / genetics
  • Betacoronavirus* / immunology
  • COVID-19
  • COVID-19 Vaccines
  • Coronavirus Infections* / epidemiology
  • Coronavirus Infections* / genetics
  • Coronavirus Infections* / immunology
  • Coronavirus Infections* / prevention & control
  • Epitopes, B-Lymphocyte / genetics
  • Epitopes, B-Lymphocyte / immunology
  • Humans
  • Immunogenicity, Vaccine
  • Machine Learning*
  • Middle East Respiratory Syndrome Coronavirus / genetics
  • Middle East Respiratory Syndrome Coronavirus / immunology
  • Pandemics* / prevention & control
  • Pneumonia, Viral* / epidemiology
  • Pneumonia, Viral* / genetics
  • Pneumonia, Viral* / immunology
  • Pneumonia, Viral* / prevention & control
  • SARS-CoV-2
  • Viral Proteins / genetics
  • Viral Proteins / immunology
  • Viral Vaccines* / genetics
  • Viral Vaccines* / immunology

Substances

  • COVID-19 Vaccines
  • Epitopes, B-Lymphocyte
  • Viral Proteins
  • Viral Vaccines