Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution

Virology. 2021 Jun;558:145-151. doi: 10.1016/j.virol.2021.02.013. Epub 2021 Mar 17.

Abstract

At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. We recommend calling the 39-codon Spike-overlapping ORF ORF2b; the 41-, 57-, and 22-codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b, respectively; the 33-codon ORF3d isoform ORF3d-2; and the 97- and 73-codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c, respectively. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity caused by prior usage of alternative names.
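
The recommended names above can be written down as a small lookup table, which makes it easy to check which name goes with which ORF. The Python sketch below is illustrative only and is not part of the published work. The names `OrfName`, `RECOMMENDED`, `recommended_name`, and `describe` are hypothetical, and the overlapped-gene labels are simply the strings used in the abstract. Genome coordinates are deliberately left out because the abstract does not state them. Listing ORF3d-2 under ORF3a is an assumption that follows from its being an ORF3d isoform.

```python
from typing import NamedTuple, Optional


class OrfName(NamedTuple):
    overlapped_gene: str  # gene the ORF overlaps, labeled as in the abstract
    codons: int           # length in codons, as stated in the abstract
    name: str             # recommended name


# Recommended names as listed in the abstract (no coordinates given there).
RECOMMENDED = [
    OrfName("Spike", 39, "ORF2b"),
    OrfName("ORF3a", 41, "ORF3c"),
    OrfName("ORF3a", 57, "ORF3d"),
    OrfName("ORF3a", 22, "ORF3b"),
    # Assumption: ORF3d-2 is an isoform of ORF3d, so it is filed under ORF3a.
    OrfName("ORF3a", 33, "ORF3d-2"),
    OrfName("Nucleocapsid", 97, "ORF9b"),
    OrfName("Nucleocapsid", 73, "ORF9c"),
]


def recommended_name(overlapped_gene: str, codons: int) -> Optional[str]:
    """Return the recommended name for an overlapping ORF, or None if unlisted."""
    for entry in RECOMMENDED:
        if entry.overlapped_gene == overlapped_gene and entry.codons == codons:
            return entry.name
    return None


def describe(name: str) -> Optional[str]:
    """Return an unambiguous label that pairs a name with its length and host gene."""
    for entry in RECOMMENDED:
        if entry.name == name:
            return f"{entry.name} ({entry.codons} codons, overlapping {entry.overlapped_gene})"
    return None
```

Looking up by overlapped gene and length, rather than by name alone, mirrors the recommendation to report lengths alongside names: `recommended_name("ORF3a", 22)` identifies ORF3b regardless of what a given study happened to call it.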

Keywords: Accessory protein; Alternative reading frame; Nomenclature; ORF2b; ORF3b; ORF3c; ORF3d; ORF9a; ORF9b; Open reading frame; Overlapping ORF; SARS-CoV-2.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Open Reading Frames*
  • SARS-CoV-2 / genetics*
  • SARS-CoV-2 / immunology
  • Spike Glycoprotein, Coronavirus* / classification
  • Spike Glycoprotein, Coronavirus* / genetics
  • Terminology as Topic*

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2