OLGenie: Estimating Natural Selection to Predict Functional Overlapping Genes

Mol Biol Evol. 2020 Aug 1;37(8):2440-2449. doi: 10.1093/molbev/msaa087.

Abstract

Purifying (negative) natural selection is a hallmark of functional biological sequences, and can be detected in protein-coding genes using the ratio of nonsynonymous to synonymous substitutions per site (dN/dS). However, when two genes overlap the same nucleotide sites in different frames, synonymous changes in one gene may be nonsynonymous in the other, perturbing dN/dS. Thus, scalable methods are needed to estimate functional constraint specifically for overlapping genes (OLGs). We propose OLGenie, which implements a modification of the Wei-Zhang method. Assessment with simulations and controls from viral genomes (58 OLGs and 176 non-OLGs) demonstrates low false-positive rates and good discriminatory ability in differentiating true OLGs from non-OLGs. We also apply OLGenie to the unresolved case of HIV-1's putative antisense protein gene, showing significant purifying selection. OLGenie can be used to study known OLGs and to predict new OLGs in genome annotation. Software and example data are freely available at https://github.com/chasewnelson/OLGenie (last accessed April 10, 2020).
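The frame-dependence described above can be illustrated with a minimal sketch. This is not the OLGenie implementation or the Wei-Zhang method; it only shows, for two same-strand reading frames, how one nucleotide change can be synonymous in one frame and nonsynonymous in the other. The function name `classify_change`, the toy sequence, and the choice of the standard genetic code are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here;
# some genomes use alternative codes).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(seq, pos, new_base, frame_offset):
    """Classify a single-nucleotide change at 0-based index `pos`.

    `frame_offset` (0, 1, or 2) is the index at which the reading frame's
    first codon starts. Returns "synonymous", "nonsynonymous", or None when
    the affected codon is not fully contained in `seq`.
    Hypothetical helper: handles same-strand frames only.
    """
    codon_start = pos - ((pos - frame_offset) % 3)
    if codon_start < 0 or codon_start + 3 > len(seq):
        return None
    old_codon = seq[codon_start:codon_start + 3]
    i = pos - codon_start
    new_codon = old_codon[:i] + new_base + old_codon[i + 1:]
    same_amino_acid = CODON_TABLE[old_codon] == CODON_TABLE[new_codon]
    return "synonymous" if same_amino_acid else "nonsynonymous"


# Toy sequence: frame 0 reads GCT GAA CGT, frame 1 reads CTG AAC.
seq = "GCTGAACGT"
in_frame_0 = classify_change(seq, 2, "C", frame_offset=0)  # GCT -> GCC
in_frame_1 = classify_change(seq, 2, "C", frame_offset=1)  # CTG -> CCG
print(in_frame_0, in_frame_1)
```

In this toy case the T-to-C change at index 2 leaves alanine unchanged in frame 0 (GCT to GCC) but replaces leucine with proline in frame 1 (CTG to CCG), so counting it as a purely synonymous change would misstate the constraint on the overlapping gene.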

Keywords: antisense protein (asp) gene; dN/dS; gene prediction; genome annotation; human immunodeficiency virus-1; open reading frame; overlapping gene (OLG); purifying (negative) selection.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genes, Overlapping*
  • Genetic Techniques*
  • HIV-1 / genetics
  • Selection, Genetic*
  • Silent Mutation*
  • Software*