OverGeneDB: a database of 5' end protein coding overlapping genes in human and mouse genomes

Nucleic Acids Res. 2018 Jan 4;46(D1):D186-D193. doi: 10.1093/nar/gkx948.


Gene overlap serves various regulatory functions at the transcriptional and post-transcriptional levels. Most current studies focus on protein-coding genes that overlap with non-protein-coding counterparts, the so-called natural antisense transcripts. Considerably less is known about the role of gene overlap between two protein-coding genes. Here, we provide OverGeneDB, a database of human and mouse protein-coding genes that overlap at their 5' ends. The database contains 582 human and 113 mouse gene pairs that are transcribed from overlapping promoters in at least one analyzed library. Gene pairs were identified by analyzing transcription start site (TSS) coordinates in 73 human and 10 mouse organs, tissues and cell lines. Besides TSS data, resources for 26 human lung adenocarcinoma cell lines also contain RNA-Seq data and ChIP-Seq data for seven histone modifications and RNA Polymerase II activity. The collected data revealed that the overlap region is rarely conserved between the studied species and tissues. In ∼50% of the overlapping genes, transcription started exclusively in the overlap regions. In the remaining half, transcription was initiated from both overlapping and non-overlapping TSSs. OverGeneDB is accessible at http://overgenedb.amu.edu.pl.
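The idea of a 5' end overlap and of TSS usage relative to the overlap region can be illustrated with a minimal Python sketch. It is not the OverGeneDB pipeline: the `Gene` class, the function names, the coordinates and the overlap criterion (two genes on opposite strands, each TSS lying inside the other gene's body) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical, simplified gene model (single TSS, single 3' end)."""
    name: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site, i.e. the 5' end
    end: int     # 3' end; end > tss on "+", end < tss on "-"


def five_prime_overlap(a, b):
    """Return (start, end) of a head-to-head overlap, or None.

    Assumed criterion: the genes lie on opposite strands and each
    TSS falls inside the other gene's body.
    """
    if a.strand == b.strand:
        return None
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    if minus.end <= plus.tss <= minus.tss <= plus.end:
        return (plus.tss, minus.tss)
    return None


def classify_tss_usage(tss_positions, overlap):
    """Label a gene by where its observed TSSs fall relative to the overlap."""
    if not tss_positions:
        raise ValueError("at least one TSS position is required")
    start, end = overlap
    inside = [start <= pos <= end for pos in tss_positions]
    if all(inside):
        return "overlap_only"
    if any(inside):
        return "mixed"
    return "non_overlap_only"


# Made-up coordinates, not taken from the database.
gene_a = Gene("GENE_A", "+", tss=1000, end=5000)
gene_b = Gene("GENE_B", "-", tss=1400, end=200)

region = five_prime_overlap(gene_a, gene_b)
print(region)
print(classify_tss_usage([1000, 1200], region))
print(classify_tss_usage([1000, 2500], region))
```

In this toy model, the `overlap_only` label corresponds to genes whose transcription starts exclusively in the overlap region, and `mixed` to genes that use both overlapping and non-overlapping TSSs.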

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Genetic*
  • Gene Expression
  • Genes, Overlapping*
  • Histone Code
  • Humans
  • Mice
  • Multigene Family
  • Open Reading Frames
  • Promoter Regions, Genetic
  • RNA Polymerase II / metabolism
  • Sequence Analysis, RNA
  • Transcription Factors / metabolism
  • Transcription Initiation Site

Substances

  • Transcription Factors
  • RNA Polymerase II