Natural selection plays a significant role in governing the codon usage bias in the novel SARS-CoV-2 variants of concern (VOC)

PeerJ. 2022 Jun 23;10:e13562. doi: 10.7717/peerj.13562. eCollection 2022.


The ongoing COVID-19 pandemic caused by SARS-CoV-2 has become a major global health concern. The SARS-CoV-2 genome encodes the spike (S) glycoprotein, which plays a crucial role in viral entry into the host cell by binding, through its receptor binding domain (RBD), to the host angiotensin converting enzyme 2 (ACE2) receptor. The continuous evolution of the SARS-CoV-2 genome, through the emergence of novel mutations, has produced more severe and more transmissible variants termed 'variants of concern' (VOC). The currently designated alpha, beta, gamma, delta and omicron VOC are the focus of this study because of their high transmissibility, increased virulence, and concerns about decreased effectiveness of the available vaccines. In VOC, mutations in the spike (S) gene and in non-structural proteins may affect the efficacy of the approved COVID-19 vaccines. Several studies of SARS-CoV-2 diversity have been performed, but on a limited number of sequences, and only a few have focused on codon usage bias (CUBs) pattern analysis across all the VOC strains. Therefore, to evaluate the evolutionary divergence of all VOC S-genes, we performed CUBs analysis on 300,354 sequences to understand their evolutionary relationship and their adaptation to different hosts, i.e., humans, bats, and pangolins. Base composition and relative synonymous codon usage (RSCU) analysis revealed 20 preferred AU-ended and 10 under-preferred GC-ended codons. In addition, the CpG dinucleotide was found to be depleted, which may be attributable to an adaptive response by the virus to escape host defenses. Moreover, the effective number of codons (ENC) values revealed a higher bias in codon usage in the VOC S-gene. Further, neutrality plot analysis indicated that natural selection accounts for 83.93% of the influence on the S-genes analyzed in this study, suggesting its pivotal role in shaping the CUBs. The CUBs pattern of S-genes was found to be very similar among all the VOC strains.
Interestingly, we observed that the VOC strains followed a trend of antagonistic codon usage with respect to the human host. The identified CUBs divergence would help in understanding virus evolution and host adaptation, and thus in designing novel vaccine strategies against the emerging VOC strains. To the best of our knowledge, this is the first report on the evolution of the CUBs pattern in all the currently identified VOC.
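
The abstract names several codon usage measures without defining them. As a rough illustration only, the sketch below shows how RSCU, a CpG observed/expected ratio, and the neutrality plot slope (regression of GC12 on GC3) are commonly computed. It is not the authors' pipeline: all function names are hypothetical, the standard genetic code is assumed, and the reading of the slope as the share of mutational pressure (with the remainder attributed to natural selection) is the usual convention rather than something stated in this record. Under that convention, the reported 83.93% would correspond to a slope of about 0.16.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons (RNA 'U' is read as 'T')."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rscu(seq):
    """RSCU = observed codon count / mean count over its synonymous family.

    Values above 1 mark codons used more often than expected under equal
    usage. Stop codons and single-codon amino acids (Met, Trp) are skipped.
    """
    counts = Counter(c for c in codons(seq) if c in CODON_TABLE)
    values = {}
    for aa in set(CODON_TABLE.values()) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio; values well below 1 suggest depletion."""
    seq = seq.upper().replace("U", "T")
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def gc12_gc3(seq):
    """Return (GC12, GC3): mean GC fraction at codon positions 1-2, and at 3."""
    cs = codons(seq)

    def gc_at(pos):
        return sum(c[pos] in "GC" for c in cs) / len(cs)

    return (gc_at(0) + gc_at(1)) / 2, gc_at(2)


def neutrality_slope(gc3_values, gc12_values):
    """Least-squares slope of GC12 (y) regressed on GC3 (x), one point per gene."""
    n = len(gc3_values)
    mean_x = sum(gc3_values) / n
    mean_y = sum(gc12_values) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(gc3_values, gc12_values))
    return sxy / sxx


def selection_share_percent(slope):
    """Conventional reading: slope ~ mutational pressure, 1 - slope ~ selection."""
    return (1 - abs(slope)) * 100
```

In practice each S-gene sequence would contribute one (GC3, GC12) point to the regression, and RSCU would typically be computed over the pooled codon counts of a variant rather than a single short sequence.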

Keywords: Codon usage bias; Mutational pressure; Natural selection; SARS-CoV-2; Variants of concern (VOC).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • COVID-19 Vaccines
  • COVID-19* / epidemiology
  • Chiroptera* / genetics
  • Codon Usage
  • Humans
  • Pandemics
  • SARS-CoV-2 / genetics
  • Selection, Genetic


Substances

  • COVID-19 Vaccines

Supplementary concepts

  • SARS-CoV-2 variants

Grant support

The work was supported by funding provided to the Translational Bioinformatics Group at ICGEB by the Department of Biotechnology (Grant Number BT/PR40151/BTIS/137/5/2021). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.