29 mammalian genomes reveal novel exaptations of mobile elements for likely regulatory functions in the human genome

PLoS One. 2012;7(8):e43128. doi: 10.1371/journal.pone.0043128. Epub 2012 Aug 27.

Abstract

Recent research supports the view that changes in gene regulation, as opposed to changes in the genes themselves, play a significant role in morphological evolution. Gene regulation is largely dependent on transcription factor binding sites. Researchers are now able to use the available 29 mammalian genomes to measure selective constraint at the level of binding sites. This detailed map of constraint suggests that mammalian genomes co-opt fragments of mobile elements to act as gene regulatory sequence on a large scale. In the human genome we detect over 280,000 putative regulatory elements, totaling approximately 7 Mb of sequence, that originated as mobile element insertions. These putative regulatory regions are conserved non-exonic elements (CNEEs), which show considerable cross-species constraint and signatures of continued negative selection in humans, yet do not appear in a known mature transcript. These putative regulatory elements were co-opted from SINE, LINE, LTR and DNA transposon insertions. We demonstrate that at least 11%, and an estimated 20%, of gene regulatory sequence in the human genome showing cross-species conservation was co-opted from mobile elements. The location in the genome of CNEEs co-opted from mobile elements closely resembles that of CNEEs in general, except in the centers of the largest gene deserts where recognizable co-option events are relatively rare. We find that regions of certain mobile element insertions are more likely to be held under purifying selection than others. In particular, we show 6 examples where paralogous instances of an often co-opted mobile element region define a sequence motif that closely matches a transcription factor's binding profile.
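The figures above imply two simple calculations: the average size of a co-opted element (about 7 Mb spread over roughly 280,000 elements) and the share of conserved non-exonic sequence that overlaps mobile element annotations. The following is a minimal sketch of both, not the authors' pipeline. The interval coordinates are invented toy values, and the names `merge`, `covered_bases` and `overlap_bases` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals: List[Interval]) -> int:
    """Total number of bases covered by an interval set."""
    return sum(end - start for start, end in merge(intervals))


def overlap_bases(a: List[Interval], b: List[Interval]) -> int:
    """Count bases covered by both interval sets using a two-pointer sweep."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Toy coordinates (assumed, not from the paper).
cnees = [(100, 130), (200, 260), (400, 420)]
mobile_elements = [(90, 110), (240, 300), (500, 600)]

shared = overlap_bases(cnees, mobile_elements)
fraction_from_mobile = shared / covered_bases(cnees)

# Average element length implied by the abstract's rounded totals.
mean_length_bp = 7_000_000 / 280_000
```

With the abstract's rounded totals, `mean_length_bp` comes out at 25 bp, which is consistent with constraint being measured at the scale of individual binding sites. Because the abstract reports "over 280,000" elements, the true mean is probably slightly lower.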

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions
  • Animals
  • Binding Sites
  • Gene Frequency
  • Genome*
  • Genome, Human*
  • Humans
  • Mammals / genetics*
  • Models, Genetic
  • Models, Statistical
  • Phylogeny
  • Protein Binding
  • Regulatory Elements, Transcriptional*
  • Sequence Alignment
  • Transcription Factors / metabolism

Substances

  • 5' Untranslated Regions
  • Transcription Factors