Properties and abundance of overlapping genes in viruses

Virus Evol. 2020 Feb 13;6(1):veaa009. doi: 10.1093/ve/veaa009. eCollection 2020 Jan.

Abstract

Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contain a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap such as genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. These analyses reveal that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observe a large range of overlap frequencies across families of all genome types, suggesting that gene overlap is a common evolutionary trait that provides flexible genome structures in all virus families.
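
To make the overlap-length categories above concrete, the following is a minimal sketch of how overlap lengths could be measured from gene coordinates. It is not the authors' pipeline: the function names (`overlap_length`, `overlap_lengths`, `summarise`) and the example coordinates are hypothetical, and it assumes each gene is a single contiguous interval given as 1-based inclusive `(start, end)` positions on one linear genome segment, ignoring strand, circular genomes, and spliced genes.

```python
from itertools import combinations


def overlap_length(a, b):
    """Number of nucleotides shared by two genes.

    Each gene is a (start, end) tuple, 1-based and inclusive
    (an assumption of this sketch).
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def overlap_lengths(genes):
    """Lengths of all non-zero pairwise overlaps among the genes."""
    lengths = []
    for a, b in combinations(genes, 2):
        n = overlap_length(a, b)
        if n > 0:
            lengths.append(n)
    return lengths


def summarise(lengths):
    """Summarise overlaps using the thresholds quoted in the abstract."""
    total = len(lengths)
    if total == 0:
        return {"n_overlaps": 0, "frac_under_10": 0.0,
                "frac_under_50": 0.0, "has_overlap_50_or_more": False}
    return {
        "n_overlaps": total,
        "frac_under_10": sum(n < 10 for n in lengths) / total,
        "frac_under_50": sum(n < 50 for n in lengths) / total,
        "has_overlap_50_or_more": any(n >= 50 for n in lengths),
    }


if __name__ == "__main__":
    # Hypothetical gene coordinates for one genome segment.
    genes = [(1, 100), (97, 300), (280, 400), (500, 600)]
    print(summarise(overlap_lengths(genes)))
```

Comparing every pair of genes is quadratic in the number of genes, which is acceptable for typical virus genomes; for much larger gene sets, sorting by start position and sweeping would scale better.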

Keywords: meta data; overlapping genes; overprinted genes; reference genomes.