Physicochemical properties-based hybrid machine learning technique for the prediction of SARS-CoV-2 T-cell epitopes as vaccine targets

PeerJ Comput Sci. 2024 Apr 25:10:e1980. doi: 10.7717/peerj-cs.1980. eCollection 2024.

Abstract

The majority of existing SARS-CoV-2 vaccines work by presenting the whole pathogen in attenuated form to the immune system to invoke an immune response. In contrast, a peptide-based vaccine (PBV) relies on identifying and chemically synthesizing only the immunodominant peptides, known as T-cell epitopes (TCEs), to induce a specific immune response against a particular pathogen. However, PBVs have received less attention despite holding substantial untapped potential for improving vaccine safety and immunogenicity. Identifying these TCEs for PBV design through wet-lab experiments is difficult, expensive, and time-consuming. Machine learning (ML) techniques can accurately predict TCEs, saving time and cost and speeding up vaccine development. This work proposes novel hybrid ML techniques based on the physicochemical properties of peptides to predict SARS-CoV-2 TCEs. The proposed hybrid ML techniques were evaluated using various ML model evaluation metrics and demonstrated promising results. The hybrid technique combining a decision tree classifier with chi-squared feature weighting and a forward search algorithm for optimal feature selection was identified as the best model, with an accuracy of 98.19%. Furthermore, K-fold cross-validation (KFCV) was performed to check that the model is reliable, and the results indicate that the hybrid random forest model performs consistently well in terms of accuracy relative to the other hybrid approaches. The predicted TCEs are highly likely to serve as promising vaccine targets, subject to evaluation both in vivo and in vitro. This development could potentially save countless lives globally, prevent future epidemic-scale outbreaks, and reduce the risk of mutation escape.
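
The pipeline described above (chi-squared feature weighting, forward search over the ranked features, and K-fold cross-validation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic, all function names are hypothetical, features are binarized at the median before the chi-squared test, and a simple nearest-centroid classifier stands in for the decision tree to keep the example dependency-free.

```python
import random
from statistics import mean


def chi2_score(feature, labels):
    """Chi-squared statistic of one feature (binarized at its median) vs. binary labels."""
    threshold = sorted(feature)[len(feature) // 2]
    observed = [[0, 0], [0, 0]]  # observed[feature_bin][label]
    for x, y in zip(feature, labels):
        observed[int(x >= threshold)][y] += 1
    n = len(labels)
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = sum(observed[i]) * (observed[0][j] + observed[1][j]) / n
            if expected > 0:
                score += (observed[i][j] - expected) ** 2 / expected
    return score


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid_accuracy(X, y, cols, train, test):
    """Stand-in classifier (assumption): assign each test row to the nearest class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [mean(r[c] for r in rows) for c in cols]
    correct = 0
    for i in test:
        dist = {
            label: sum((X[i][c] - m) ** 2 for c, m in zip(cols, cen))
            for label, cen in centroids.items()
        }
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / len(test)


def cv_accuracy(X, y, cols, k=5):
    """Mean accuracy over k folds, each fold used once as the test set."""
    folds = kfold_indices(len(y), k)
    scores = []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        scores.append(centroid_accuracy(X, y, cols, train, folds[f]))
    return mean(scores)


def forward_search(X, y, ranked, k=5):
    """Simplified forward search: try features in ranked order, keep those that raise CV accuracy."""
    selected, best = [], 0.0
    for col in ranked:
        acc = cv_accuracy(X, y, selected + [col], k)
        if acc > best:
            selected, best = selected + [col], acc
    return selected, best


# Synthetic stand-in for physicochemical descriptors: column 0 is informative, 1 and 2 are noise.
rng = random.Random(42)
y = [i % 2 for i in range(100)]
X = [[label * 3.0 + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]

scores = [chi2_score([row[c] for row in X], y) for c in range(3)]
ranked = sorted(range(3), key=lambda c: scores[c], reverse=True)
selected, best = forward_search(X, y, ranked)
print("chi-squared scores:", scores)
print("selected features:", selected, "CV accuracy:", best)
```

In a real setting, the rows of `X` would hold physicochemical descriptors computed from peptide sequences, and the stand-in classifier would be replaced by a decision tree or random forest from an ML library.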

Keywords: COVID-19; Hybrid technique; Machine learning; Peptide based vaccine; SARS-CoV-2; T-cell epitope.

Grants and funding

This work was supported by the Deanship of Scientific Research at King Khalid University (KKU) through the Research Group Program under grant number R.G.P.2/572/44. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.