Comparative genome analysis of novel coronavirus (SARS-CoV-2) from different geographical locations and the effect of mutations on major target proteins: An in silico insight

PLoS One. 2020 Sep 3;15(9):e0238344. doi: 10.1371/journal.pone.0238344. eCollection 2020.


A novel severe acute respiratory syndrome-related coronavirus-2 (SARS-CoV-2), which causes the COVID-19 pandemic in humans, recently emerged and has spread rapidly to more than 200 countries. In this study, we investigated SARS-CoV-2 genomes reported from 13 different countries, identified mutations in the major coronavirus proteins of these genomes, and compared them with SARS-CoV. The thirteen complete SARS-CoV-2 genome sequences showed high identity (>99%) to each other, while they shared 82% identity with SARS-CoV. We performed a systematic mutational analysis of SARS-CoV-2 genomes from different geographical locations, which enabled us to identify numerous unique features of this viral genome. These include several important country-specific unique mutations in the major proteins of SARS-CoV-2, namely the replicase polyprotein, spike glycoprotein, envelope protein and nucleocapsid protein. The Indian strain carried the R408I mutation in the spike glycoprotein and the I671T, P2144S and A2798V mutations in the replicase polyprotein, while the spike proteins of the strains from Spain and South Korea carried the F797C and S221W mutations, respectively. Several other important country-specific mutations were analyzed likewise. The effects of mutations in these major proteins were also investigated using various in silico approaches. Main protease (Mpro), the SARS therapeutic target protein with the largest number of reported inhibitors, was thoroughly investigated, and the effect of mutation on its binding affinity and structural dynamics was studied. The R60C mutation in Mpro was found to affect the protein dynamics, thereby affecting the binding of an inhibitor within its active site. The implications of the mutations for structural characteristics were determined. The information provided in this manuscript holds great potential for further scientific research towards the design of potential vaccine candidates and small-molecule inhibitors against COVID-19.
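
The two quantities reported above, percent identity between sequences and point mutations written in the form R408I (reference residue, position, variant residue), can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with `-` marking gaps. The function names `percent_identity` and `list_substitutions` and the toy sequences are hypothetical and chosen only for illustration; the abstract does not state which tools the authors used, and this is not their pipeline.

```python
from __future__ import annotations


def percent_identity(ref: str, query: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for r, q in zip(ref, query):
        if r == "-" or q == "-":
            continue  # skip gap columns (an assumption; conventions differ)
        compared += 1
        if r == q:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def list_substitutions(ref: str, query: str) -> list[str]:
    """List substitutions such as 'R408I', numbered by reference residue position."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    position = 0  # 1-based position in the ungapped reference
    for r, q in zip(ref, query):
        if r != "-":
            position += 1
        if r == "-" or q == "-":
            continue  # insertions and deletions are not reported here
        if r != q:
            substitutions.append(f"{r}{position}{q}")
    return substitutions


if __name__ == "__main__":
    # Toy aligned peptide fragments, not real SARS-CoV-2 data.
    reference = "MKRVLA"
    variant = "MKIVLA"
    print(list_substitutions(reference, variant))  # ['R3I']
    print(round(percent_identity(reference, variant), 1))
```

Numbering by reference position means that insertions in the variant do not shift the reported residue numbers, which is the usual convention for labels such as R408I.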

MeSH terms

  • Betacoronavirus / classification
  • Betacoronavirus / genetics*
  • Coronavirus 3C Proteases
  • Coronavirus Envelope Proteins
  • Coronavirus Nucleocapsid Proteins
  • Cysteine Endopeptidases / chemistry
  • Cysteine Endopeptidases / genetics*
  • Genetic Variation
  • Genome, Viral*
  • Molecular Dynamics Simulation
  • Mutation*
  • Nucleocapsid Proteins / chemistry
  • Nucleocapsid Proteins / genetics*
  • Phosphoproteins
  • Phylogeny
  • SARS-CoV-2
  • Spike Glycoprotein, Coronavirus / chemistry
  • Spike Glycoprotein, Coronavirus / genetics*
  • Viral Envelope Proteins / chemistry
  • Viral Envelope Proteins / genetics*
  • Viral Nonstructural Proteins / chemistry
  • Viral Nonstructural Proteins / genetics*


Substances

  • Coronavirus Envelope Proteins
  • Coronavirus Nucleocapsid Proteins
  • Nucleocapsid Proteins
  • Phosphoproteins
  • Spike Glycoprotein, Coronavirus
  • Viral Envelope Proteins
  • Viral Nonstructural Proteins
  • envelope protein, SARS-CoV-2
  • nucleocapsid phosphoprotein, SARS-CoV-2
  • spike protein, SARS-CoV-2
  • Cysteine Endopeptidases
  • Coronavirus 3C Proteases

Grant support

The authors received no specific funding for this work.