Intergenic repeat units of 127 bp (RU-1) and 168 bp (RU-2), together with a newly identified 103-bp class (RU-3), are small mobile sequences found in multiple intergenic regions of enterobacterial genomes. These repeat sequences resemble eukaryotic miniature inverted-repeat transposable elements (MITEs). RU mobile elements have not previously been reported to encode amino acid sequences. We used an in silico approach to scan genomes for the locations of repeat units. RU sequences were found to contain open reading frames, and these occur within annotated gene loci in which the RU-encoded amino acid sequence is maintained. Gene loci that contain repeat units include those encoding large proteins that belong to superfamilies with conserved domains, and those with predicted motifs such as signal peptide sequences and transmembrane domains. A putative exported protein in Yersinia pestis and a phylogenetically conserved putative inner membrane protein in Salmonella species are among the more interesting constructs. We hypothesize that a major outcome of RU open reading frame fusions is the evolutionary emergence of new proteins.
Keywords: gene evolution; intergenic repeat units; intragenic insertions; mobile elements; protein domains.