The current outbreak of coronavirus disease-2019 (COVID-19) caused by SARS-CoV-2 poses unparalleled challenges to global public health. SARS-CoV-2 is a Betacoronavirus, one of four genera belonging to the Coronaviridae subfamily Orthocoronavirinae. Coronaviridae, in turn, are members of the order Nidovirales, a group of enveloped, positive-stranded RNA viruses. Here we present a systematic phylogenetic and evolutionary study based on protein domain architecture, encompassing the entire proteomes of all Orthocoronavirinae, as well as other Nidovirales. This analysis has revealed that the genomic evolution of Nidovirales is associated with extensive gains and losses of protein domains. In Orthocoronavirinae, the sections of the genomes that show the largest divergence in protein domains are found in the proteins encoded in the amino-terminal end of the polyprotein (PP1ab), the spike protein (S), and many of the accessory proteins. The diversity among the accessory proteins is particularly striking, as each subgenus possesses a set of accessory proteins that is almost entirely specific to that subgenus. The only notable exception to this is ORF3b, which is present and orthologous over all Alphacoronaviruses. In contrast, the membrane protein (M), envelope small membrane protein (E), nucleoprotein (N), as well as proteins encoded in the central and carboxy-terminal end of PP1ab (such as the 3C-like protease, RNA-dependent RNA polymerase, and Helicase) show stable domain architectures across all Orthocoronavirinae. This comprehensive analysis of the Coronaviridae domain architecture has important implication for efforts to develop broadly cross-protective coronavirus vaccines.
Keywords: Coronaviridae; Evolution; Genome; Hidden Markov models; Nidovirales; Orthocoronavirinae; Phylogenetics; Phylogenomics; Protein domains.
Copyright © 2022 The Authors. Published by Elsevier Inc. All rights reserved.