Coronavirus genomes carry the signatures of their habitats

PLoS One. 2020 Dec 22;15(12):e0244025. doi: 10.1371/journal.pone.0244025. eCollection 2020.


Coronaviruses such as SARS-CoV-2 regularly infect host tissues that express antiviral proteins (AVPs) in abundance. Understanding how they evolve to adapt to or evade host immune responses is important in the effort to control the spread of infection. Two AVPs that may shape viral genomes are the zinc finger antiviral protein (ZAP) and the apolipoprotein B mRNA editing enzyme-catalytic polypeptide-like 3 (APOBEC3). The former binds to CpG dinucleotides to facilitate the degradation of viral transcripts, while the latter frequently deaminates C to U residues, which could generate notable viral sequence variation. We tested the hypothesis that both APOBEC3 and ZAP impose selective pressures that shape the genome of an infecting coronavirus. Our investigation considered a comprehensive set of publicly available genomes for seven coronaviruses (SARS-CoV-2, SARS-CoV, and MERS infecting Homo sapiens, Bovine CoV infecting Bos taurus, MHV infecting Mus musculus, HEV infecting Sus scrofa, and CRCoV infecting Canis lupus familiaris). We show that coronaviruses that regularly infect tissues with abundant AVPs have CpG-deficient and U-rich genomes, whereas those that do not infect such tissues do not share these sequence hallmarks. Among the coronaviruses surveyed herein, CpG is most deficient in SARS-CoV-2, and a temporal analysis showed a marked increase in C to U mutations over four months of SARS-CoV-2 genome evolution. Furthermore, the preferred motifs in which these C to U mutations occur are the same as those subject to APOBEC3 editing in HIV-1. These results suggest that both ZAP and APOBEC3 shape the SARS-CoV-2 genome: ZAP imposes a strong CpG avoidance, and APOBEC3 constantly edits C to U. Evolutionary pressures exerted by host immune systems onto viral genomes may motivate novel strategies for SARS-CoV-2 vaccine development.
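The sequence hallmarks named in the abstract (CpG deficiency, U richness, and C to U changes in particular motifs) can be illustrated with a minimal Python sketch. It assumes the widely used observed/expected CpG ratio, f(CpG) / (f(C) × f(G)), which may differ in detail from the index used in the paper. The function names (cpg_index, u_fraction, c_to_u_contexts) are hypothetical, and the inputs are toy strings rather than real genomes.

```python
from collections import Counter


def _as_rna(seq: str) -> str:
    """Upper-case the sequence and write T as U so DNA and RNA input both work."""
    return seq.upper().replace("T", "U")


def cpg_index(seq: str) -> float:
    """Observed/expected CpG ratio: f(CpG) / (f(C) * f(G)).

    Values well below 1 indicate CpG deficiency. This is an assumed,
    commonly used definition, not necessarily the paper's exact index.
    """
    s = _as_rna(seq)
    n = len(s)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    f_c = s.count("C") / n
    f_g = s.count("G") / n
    if f_c == 0 or f_g == 0:
        return 0.0  # no CpG is possible without both C and G
    # Frequency of CG among the n - 1 overlapping dinucleotides.
    f_cpg = sum(1 for i in range(n - 1) if s[i:i + 2] == "CG") / (n - 1)
    return f_cpg / (f_c * f_g)


def u_fraction(seq: str) -> float:
    """Fraction of the sequence that is U (T in DNA notation)."""
    s = _as_rna(seq)
    if not s:
        raise ValueError("sequence must not be empty")
    return s.count("U") / len(s)


def c_to_u_contexts(ref: str, var: str) -> Counter:
    """Count C-to-U substitutions by trinucleotide context in the reference.

    Assumes ref and var are already aligned, gap-free, and equal in length.
    The first and last positions are skipped because they lack a full context.
    """
    r, v = _as_rna(ref), _as_rna(var)
    if len(r) != len(v):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(1, len(r) - 1):
        if r[i] == "C" and v[i] == "U":
            counts[r[i - 1:i + 2]] += 1
    return counts
```

On real data, one would compute cpg_index and u_fraction per genome and compare them across viruses, and apply c_to_u_contexts to each sampled genome against a reference to see which flanking nucleotides dominate. Alignment and quality filtering are outside the scope of this sketch.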

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • APOBEC Deaminases
  • Animals
  • COVID-19 / genetics*
  • COVID-19 / pathology
  • COVID-19 / virology
  • Cattle
  • Coronavirus / classification
  • Coronavirus / genetics*
  • Coronavirus / pathogenicity
  • Cytidine Deaminase / genetics*
  • Dogs
  • Evolution, Molecular
  • Genome, Viral / genetics
  • Humans
  • Mice
  • Middle East Respiratory Syndrome Coronavirus / genetics
  • Middle East Respiratory Syndrome Coronavirus / pathogenicity
  • RNA-Binding Proteins / genetics*
  • Repressor Proteins / genetics*
  • SARS-CoV-2 / genetics
  • SARS-CoV-2 / pathogenicity
  • Severe acute respiratory syndrome-related coronavirus / genetics
  • Severe acute respiratory syndrome-related coronavirus / pathogenicity
  • Swine / virology


Substances

  • RNA-Binding Proteins
  • Repressor Proteins
  • YLPM1 protein, human
  • APOBEC Deaminases
  • APOBEC3 proteins, human
  • Cytidine Deaminase

Grants and funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC; Discovery Grant to X.X. [RGPIN/2018-03878] and NSERC Doctoral Scholarship to Y.W. [CGSD/2019-535291]). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.