Comparative analysis of twelve genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features

J Virol. 2007 Feb;81(4):1574-85. doi: 10.1128/JVI.02182-06. Epub 2006 Nov 22.


Twelve complete genomes were sequenced from three novel coronaviruses: bat coronavirus HKU4 (bat-CoV HKU4), bat-CoV HKU5 (putative group 2c), and bat-CoV HKU9 (putative group 2d). Comparative genome analysis showed that the various open reading frames (ORFs) of the genomes of the three coronaviruses had significantly higher amino acid identities to those of other group 2 coronaviruses than to those of group 1 and 3 coronaviruses. Phylogenetic trees constructed using chymotrypsin-like protease, RNA-dependent RNA polymerase, helicase, spike, and nucleocapsid all showed that the group 2a and 2b and putative group 2c and 2d coronaviruses are more closely related to each other than to group 1 and 3 coronaviruses. Unique genomic features distinguishing between these four subgroups, including the number of papain-like proteases, the presence or absence of hemagglutinin esterase, small ORFs between the membrane and nucleocapsid genes and ORFs (NS7a and NS7b), bulged stem-loop and pseudoknot structures downstream of the nucleocapsid gene, transcription regulatory sequence, and ribosomal recognition signal for the envelope gene, were also observed. This is the first time that NS7a and NS7b downstream of the nucleocapsid gene have been found in a group 2 coronavirus. The high Ka/Ks ratio of NS7a and NS7b in bat-CoV HKU9 implies that these two group 2d-specific genes are under high selective pressure and hence are rapidly evolving. The four subgroups of group 2 coronaviruses probably originated from a common ancestor. Further molecular epidemiological studies on coronaviruses in the bats of other countries, as well as in other animals, and complete genome sequencing will shed more light on the diversity and evolutionary histories of coronaviruses.
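The amino acid identity comparison described above can be illustrated with a minimal sketch. It assumes the two protein sequences have already been aligned (equal length, with `-` marking gaps) and counts identity only over columns where neither sequence has a gap. This is one common convention and not necessarily the one the authors used. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the deposited genomes.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumes both inputs come from the same pairwise alignment, so they
    must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (NOT real coronavirus sequences).
query = "MKT-LLVAG"
ref_same_group = "MKTALLVSG"   # hypothetical closer relative
ref_other_group = "MRS-ILTAG"  # hypothetical more distant relative

print(percent_identity(query, ref_same_group))   # 87.5
print(percent_identity(query, ref_other_group))  # 50.0
```

In this toy case the query shares more identical residues with the first reference than with the second. That is the kind of pattern the abstract reports when it says the ORFs were more similar to those of other group 2 coronaviruses than to those of groups 1 and 3.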

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Chiroptera / virology
  • Coronaviridae / classification*
  • Coronaviridae / genetics*
  • Coronaviridae / isolation & purification
  • Genome, Viral*
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Open Reading Frames / genetics*
  • Phylogeny
  • RNA, Viral / chemistry
  • Sequence Alignment
  • Species Specificity


Substances

  • RNA, Viral

Associated data

  • GENBANK/EF065505
  • GENBANK/EF065506
  • GENBANK/EF065507
  • GENBANK/EF065508
  • GENBANK/EF065509
  • GENBANK/EF065510
  • GENBANK/EF065511
  • GENBANK/EF065512
  • GENBANK/EF065513
  • GENBANK/EF065514
  • GENBANK/EF065515
  • GENBANK/EF065516