Full-genome evolutionary analysis of the novel corona virus (2019-nCoV) rejects the hypothesis of emergence as a result of a recent recombination event

Infect Genet Evol. 2020 Apr;79:104212. doi: 10.1016/j.meegid.2020.104212. Epub 2020 Jan 29.

Abstract

Background: A novel coronavirus (2019-nCoV) associated with human-to-human transmission and severe human infection has recently been reported from the city of Wuhan in China. Our objectives were to characterize the genetic relationships of 2019-nCoV and to search for putative recombination within the sarbecovirus subgenus.

Methods: Putative recombination was investigated using RDP4 and Simplot v3.5.1, and discordant phylogenetic clustering in individual genomic fragments was confirmed by phylogenetic analysis using maximum likelihood and Bayesian methods.
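
As a rough illustration of the kind of similarity scan used to look for recombination signals, the sketch below computes percent identity between two aligned sequences in sliding windows. It is not the authors' pipeline and does not reproduce the internals of RDP4 or Simplot; the function name `window_identity`, the default window and step sizes, and the toy sequences are all assumptions made for this example.

```python
def window_identity(query, reference, window=200, step=20):
    """Return (window_start, percent_identity) pairs for two aligned sequences.

    Hypothetical helper for illustration only; window and step defaults
    are arbitrary and not taken from the paper.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Ignore alignment columns where either sequence has a gap.
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue
        matches = sum(a == b for a, b in pairs)
        results.append((start, 100.0 * matches / len(pairs)))
    return results


# Toy aligned sequences (made up): identical in the first half,
# completely different in the second half.
toy_query = "AAAAAAAA"
toy_reference = "AAAATTTT"
profile = window_identity(toy_query, toy_reference, window=4, step=4)
```

An abrupt change in identity along the genome, as between the two windows of the toy input, is the sort of pattern that would then be checked with region-by-region phylogenetic trees.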

Results: Our analysis suggests that 2019-nCoV, although closely related to the BatCoV RaTG13 sequence throughout the genome (sequence similarity 96.3%), shows discordant clustering with the Bat_SARS-like coronavirus sequences. Specifically, in the 5'-part spanning the first 11,498 nucleotides and the 3'-part spanning positions 24,341-30,696, 2019-nCoV and RaTG13 formed a single cluster with Bat_SARS-like coronavirus sequences, whereas in the middle region spanning the 3'-end of ORF1a, ORF1b and almost half of the spike region, 2019-nCoV and RaTG13 grouped in a separate distant lineage within the sarbecovirus branch.
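
To make the three genomic regions concrete, the following sketch slices an aligned sequence at the positions quoted above so that each fragment could be analysed separately. It rests on several assumptions: the positions are treated as 1-based, inclusive alignment coordinates; the middle region is taken to be everything between the two stated regions (11,499-24,340), which the abstract does not state explicitly; and the names `REGIONS` and `split_regions` are hypothetical.

```python
# Region boundaries as 1-based, inclusive coordinates.
# The 5' and 3' ranges come from the abstract; the middle range is
# inferred as the gap between them (an assumption).
REGIONS = {
    "5prime": (1, 11498),
    "middle": (11499, 24340),
    "3prime": (24341, 30696),
}


def split_regions(aligned_seq, regions=REGIONS):
    """Return a dict mapping region name to the matching slice of the sequence."""
    # Convert 1-based inclusive coordinates to Python's 0-based slices.
    return {
        name: aligned_seq[start - 1:end]
        for name, (start, end) in regions.items()
    }


# Placeholder sequence of the right length (not real data).
placeholder = "N" * 30696
fragments = split_regions(placeholder)
fragment_lengths = {name: len(seq) for name, seq in fragments.items()}
```

Each fragment would then be used to build its own tree, which is how discordant clustering between regions becomes visible.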

Conclusions: The level of genetic similarity between 2019-nCoV and RaTG13 suggests that the latter is not the exact variant that caused the outbreak in humans, but the hypothesis that 2019-nCoV originated from bats is very likely. We show evidence that the novel coronavirus (2019-nCoV) is not a mosaic, and that almost half of its genome belongs to a distinct lineage within the betacoronaviruses. These genomic features and their potential association with virus characteristics and virulence in humans need further attention.

Keywords: Genomic sequence analysis; Molecular epidemiology; Novel coronavirus; Origin; Phylogenetic analysis; Recombination.

MeSH terms

  • Betacoronavirus / genetics*
  • Coronavirus Infections / virology
  • Genome, Viral*
  • High-Throughput Nucleotide Sequencing
  • Pandemics
  • Phylogeny*
  • Pneumonia, Viral / virology
  • Recombination, Genetic*

Supplementary concepts

  • COVID-19
  • severe acute respiratory syndrome coronavirus 2