Novel mechanism of conjoined gene formation in the human genome

Funct Integr Genomics. 2012 Mar;12(1):45-61. doi: 10.1007/s10142-011-0260-1. Epub 2012 Jan 10.

Abstract

Recently, conjoined genes (CGs) have emerged as important genetic factors necessary for understanding the human genome. However, their formation mechanism and precise structures have remained mysterious. Based on a detailed structural analysis of 57 human CG transcript variants (CGTVs, discovered in this study) and all (833) known CGs in the human genome, we discovered that the poly(A) signal site from the upstream parent gene region is completely removed via the skipping or truncation of the final exon; consequently, CG transcription is terminated at the poly(A) signal site of the downstream parent gene. This result led us to propose a novel mechanism of CG formation: the complete removal of the poly(A) signal site from the upstream parent gene is a prerequisite for the CG transcriptional machinery to continue transcribing uninterrupted into the intergenic region and downstream parent gene. The removal of the poly(A) signal sequence from the upstream gene region appears to be caused by a deletion or truncation mutation in the human genome rather than post-transcriptional trans-splicing events. With respect to the characteristics of CG sequence structures, we found that intergenic regions are hot spots for novel exon creation during CGTV formation and that exons farther from the intergenic regions are more highly conserved in the CGTVs. Interestingly, many novel exons newly created within the intergenic and intragenic regions originated from transposable element sequences. Additionally, the CGTVs showed tumor tissue-biased expression. In conclusion, our study provides novel insights into the CG formation mechanism and expands the present concepts of the genetic structural landscape, gene regulation, and gene formation mechanisms in the human genome.
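The termination logic proposed in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' analysis pipeline. It assumes that transcription ends in the first region containing a poly(A) signal hexamer (AATAAA, or the common variant ATTAAA) and ignores cleavage factors, downstream elements, and splicing. All sequences, region names, and function names below are hypothetical and chosen only for illustration.

```python
# Toy model of the proposed mechanism: transcription is treated as ending in
# the first region (5'->3') that contains a poly(A) signal hexamer.
# Assumption: real termination is far more complex; this only mirrors the
# abstract's logic that removing the upstream signal permits read-through.

POLYA_SIGNALS = ("AATAAA", "ATTAAA")  # canonical hexamer and a common variant


def first_polya_signal(seq):
    """Return the 0-based index of the first poly(A) signal in seq, or None."""
    hits = [i for i in (seq.find(s) for s in POLYA_SIGNALS) if i != -1]
    return min(hits) if hits else None


def termination_region(regions):
    """Walk (name, sequence) pairs 5'->3' and return the name of the first
    region that contains a poly(A) signal, or None if there is none."""
    for name, seq in regions:
        if first_polya_signal(seq.upper()) is not None:
            return name
    return None


# Hypothetical sequences (not taken from the study).
upstream_final_exon = "GCTGACCTGAGGCAATAAAGTCTGC"   # carries AATAAA
truncated_final_exon = upstream_final_exon[:12]     # truncation removes the signal
intergenic = "CCGTGGCTCAGGCTGGAGTGC"                # no signal
downstream_final_exon = "TTGCCATTGGAATAAACTGTG"     # carries AATAAA

# Intact upstream gene: transcription ends at the upstream parent gene.
normal = termination_region([
    ("upstream_final_exon", upstream_final_exon),
    ("intergenic", intergenic),
    ("downstream_final_exon", downstream_final_exon),
])

# Truncated final exon: read-through into the downstream parent gene (a CG).
conjoined = termination_region([
    ("upstream_final_exon", truncated_final_exon),
    ("intergenic", intergenic),
    ("downstream_final_exon", downstream_final_exon),
])

print(normal)
print(conjoined)
```

In this sketch, truncating the upstream final exon is what shifts termination to the downstream parent gene, mirroring the prerequisite described in the abstract. Skipping the final exon entirely could be modeled the same way by omitting that region from the list.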

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions
  • Alternative Splicing
  • Base Sequence
  • Cloning, Molecular
  • Exons*
  • Genome, Human*
  • HEK293 Cells
  • Humans
  • Mutagenesis*
  • Mutant Chimeric Proteins / genetics*
  • Mutant Chimeric Proteins / metabolism
  • Neoplasms / metabolism
  • Polyadenylation
  • RNA, Messenger / genetics
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Deletion
  • Transcription, Genetic

Substances

  • 3' Untranslated Regions
  • Mutant Chimeric Proteins
  • RNA, Messenger