In prokaryotes, translation initiation typically depends on complementary binding between a G-rich Shine-Dalgarno (SD) motif in the 5' untranslated region of mRNAs and the 3' tail of the 16S ribosomal RNA (the anti-SD sequence). In some cases, internal SD-like motifs in the coding region generate "programmed" ribosomal pauses that are beneficial for protein folding or accurate targeting. On the other hand, such pauses can also reduce protein production, generating purifying selection against internal SD-like motifs. This selection should be stronger in GC-rich genomes, which are more likely to harbor the G-rich SD motif. However, the nature and consequences of selection acting on internal SD-like motifs within genomes and across species remain unclear. We analyzed the frequency of SD-like hexamers in the coding regions of 284 prokaryotes (277 with known anti-SD sequences and 7 without a typical SD mechanism). After accounting for GC content, we found that internal SD-like hexamers are avoided in 230 species, including three without a typical SD mechanism. The degree of avoidance was higher in GC-rich genomes, mesophiles, and N-terminal regions of genes. In contrast, 54 species either showed no signature of avoidance or were enriched in internal SD-like motifs. C-terminal gene regions were relatively enriched in SD-like hexamers, particularly for genes in operons or those followed closely by downstream genes. Together, our results suggest that the frequency of internal SD-like hexamers is governed by multiple factors, including GC content and genome organization, and that further empirical work is necessary to understand the evolution and functional roles of these motifs.
Keywords: GC content; Shine–Dalgarno sequence; anti-SD affinity; hexamer frequency; translational pausing.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
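To make the idea of measuring SD-like hexamer frequency "after accounting for GC content" concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical definition of "SD-like" (any hexamer within one mismatch of the consensus core AGGAGG, whereas the study defines motifs by anti-SD affinity) and a simple null model in which bases are drawn independently at the frequencies observed in the sequence. The names `sd_like_hexamers` and `observed_expected` are illustrative only.

```python
from collections import Counter
from itertools import product
from math import prod

CANONICAL_SD = "AGGAGG"  # assumption: consensus SD core used as a stand-in


def sd_like_hexamers(core=CANONICAL_SD, max_mismatches=1):
    """Hypothetical motif set: all hexamers within max_mismatches of the core."""
    return {
        "".join(h)
        for h in product("ACGT", repeat=len(core))
        if sum(a != b for a, b in zip(h, core)) <= max_mismatches
    }


def observed_expected(seq, motifs, k=6):
    """Return (observed, expected) counts of motif hexamers in seq.

    The expected count assumes independent bases drawn at the
    sequence's own A/C/G/T frequencies, a crude composition control.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    if n_windows <= 0 or total == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    # Count every overlapping window that matches a motif.
    observed = sum(seq[i:i + k] in motifs for i in range(n_windows))
    freq = {b: counts[b] / total for b in "ACGT"}
    # Per-window probability of hitting any motif, times number of windows.
    expected = n_windows * sum(prod(freq[b] for b in m) for m in motifs)
    return observed, expected
```

An observed-to-expected ratio below 1 would be consistent with avoidance under this simplified null; a real analysis would also need to control for codon usage and amino acid composition, which this sketch ignores.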