Coding sequences of functioning human genes derived entirely from mobile element sequences

Proc Natl Acad Sci U S A. 2004 Nov 30;101(48):16825-30. doi: 10.1073/pnas.0406985101. Epub 2004 Nov 16.

Abstract

Mobile elements, or "parasitic sequences," affect the function of the human genome in many ways. This paper describes several functioning genes whose sequences have been almost completely derived from mobile elements. There are many examples where the synthetic coding sequences of observed mRNA sequences are made up of mobile element sequences to an extent of 80% or more of the length of the coding sequence. In the examples described here, the genes have named functions, and some of these functions have been studied. It appears that each of the functioning genes was originally formed from mobile elements and that, through some process of molecular evolution, a coding sequence was derived that could be translated into a protein of some importance to human biology. In one case (AD7C), 99% of the coding sequence is made up of a cluster of Alu sequences. In another, 97% of the BNIP3 coding sequence is made up of sequences from an apparent human endogenous retrovirus. The Syncytin gene coding sequence appears to be derived from an endogenous retrovirus envelope gene.
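
The percentages quoted above are fractions of coding-sequence length. As a rough illustration of that arithmetic, the sketch below computes the share of a coding sequence overlapped by annotated mobile-element hits. It is only an assumption about how such a figure could be computed, not the procedure used in the paper; the function names and all coordinates are hypothetical, and intervals are taken to be half-open (start, end) positions on the mRNA.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def mobile_element_fraction(cds, elements):
    """Fraction of the CDS (half-open start, end) covered by element hits."""
    cds_start, cds_end = cds
    cds_len = cds_end - cds_start
    if cds_len <= 0:
        raise ValueError("CDS must have positive length")
    # Clip each hit to the CDS and drop hits that fall entirely outside it.
    clipped = [
        (max(s, cds_start), min(e, cds_end))
        for s, e in elements
        if min(e, cds_end) > max(s, cds_start)
    ]
    # Merge first so that overlapping hits are not counted twice.
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return covered / cds_len


# Hypothetical coordinates, for illustration only (not data from the paper).
cds = (100, 400)
hits = [(50, 220), (200, 310), (350, 380)]
fraction = mobile_element_fraction(cds, hits)
print(f"{fraction:.1%} of the coding sequence overlaps mobile elements")
```

Merging the hits before summing matters because repeat annotations often overlap; without it, a cluster of adjacent Alu hits could push the apparent coverage above 100%.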

MeSH terms

  • Genome, Human*
  • Humans
  • Open Reading Frames*
  • RNA, Messenger / genetics

Substances

  • RNA, Messenger