Over-represented sequences located on 3' UTRs are potentially involved in regulatory functions

RNA Biol. 2008 Oct-Dec;5(4):255-62. doi: 10.4161/rna.7116. Epub 2008 Oct 3.


Eukaryotic gene expression must be coordinated for the proper functioning of biological processes. This coordination can be achieved at both the transcriptional and post-transcriptional levels. In both cases, regulatory sequences located either in promoter regions or on UTRs function as markers recognized by regulators, which can then activate or repress different groups of genes as needed. While regulatory sequences involved in transcription are quite well documented, there is a lack of information on sequence elements involved in post-transcriptional regulation. We used a statistical over-representation method to identify novel regulatory elements located on UTRs. An exhaustive search approach was used to calculate the frequency of all possible n-mers (short nucleotide sequences) in 16,160 human genes from NCBI RefSeq and to identify any peculiar usage of n-mers on UTRs. After a stringent filtering process, we identified 2,772 highly over-represented n-mers on 3' UTRs. We provide evidence that these n-mers are potentially involved in regulatory functions. The identified n-mers overlap with previously identified binding sites for HuR and TIA-1 and with ARE and GRE sequences. We also show that the n-mers overlap with predicted miRNA target sites. Finally, a method to cluster n-mer groups allowed the identification of putative gene networks.
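The exhaustive n-mer search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the statistic or the filtering steps, so everything here is an assumption made for illustration: the function names `count_nmers` and `enrichment`, the pseudocount, and the use of a simple frequency ratio between a target set (for example 3' UTRs) and a background set are hypothetical and are not the authors' method.

```python
from collections import Counter
from itertools import product


def count_nmers(sequences, n):
    """Count every overlapping n-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - n + 1):
            nmer = seq[i:i + n]
            # Skip windows containing ambiguous bases such as N.
            if set(nmer) <= set("ACGT"):
                counts[nmer] += 1
    return counts


def enrichment(target_seqs, background_seqs, n, pseudocount=1.0):
    """Return target/background frequency ratios for all 4**n n-mers.

    Assumption: a plain frequency ratio with a pseudocount stands in
    for the unspecified over-representation statistic.
    """
    target = count_nmers(target_seqs, n)
    background = count_nmers(background_seqs, n)
    t_total = sum(target.values()) + pseudocount * 4 ** n
    b_total = sum(background.values()) + pseudocount * 4 ** n
    ratios = {}
    for letters in product("ACGT", repeat=n):
        nmer = "".join(letters)
        t_freq = (target[nmer] + pseudocount) / t_total
        b_freq = (background[nmer] + pseudocount) / b_total
        ratios[nmer] = t_freq / b_freq
    return ratios


# Toy sequences, invented for illustration only.
utr_like = ["ATTTATTTA"]
background_like = ["GCGCGCGCG"]
ratios = enrichment(utr_like, background_like, n=3)
top = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(top)
```

A ratio above 1 marks an n-mer that is more frequent in the target set than in the background. A real analysis would also need a significance test and multiple-testing correction, neither of which is shown here.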

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions / genetics*
  • Base Sequence
  • Binding Sites
  • Cluster Analysis
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Humans
  • MicroRNAs / genetics
  • Molecular Sequence Data
  • Operon / genetics
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Transcription, Genetic


Substances

  • 3' Untranslated Regions
  • MicroRNAs