STR2: a structure to string approach for locating G-box riboswitch shapes in pre-selected genes

In Silico Biol. 2004;4(4):593-604.


Traditional sequence-based search methods such as BLAST and FASTA can be used to identify sequence similarities. There is growing interest in performing RNA shape similarity searches inside selected genes to locate RNA structure motifs that are known to have functionally important roles. For example, in the newly discovered RNA genetic control elements called "riboswitches", the box domain is known to be highly conserved among various bacterial species in both its nucleotide composition and shape. In non-bacterial species, however, shape conservation is likely to become more important than sequence conservation when searching for riboswitch patterns. For this purpose, we present an approach tailored for detecting RNA shape similarities. We extend the Structure to String (STR2) method, initially proposed for locating shape similarities in proteins, to the predicted secondary structures of RNAs. STR2 for RNAs translates a secondary structure into a string of characters, after which known sequence-based search algorithms with efficient implementations are applied. We validate that STR2 succeeds in locating G-box riboswitches in prokaryotes, as expected. Subsequently, we show running examples of attempts to detect G-box riboswitch candidates in eukaryotes.
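The structure-to-string idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual STR2 alphabet or the search tool settings, so everything below is an assumption made for illustration: the input is taken to be a predicted secondary structure in dot-bracket notation, the run-collapsing alphabet (`s`/`t`/`l`, upper case for long runs) is hypothetical, and Python's `difflib.SequenceMatcher` stands in for a BLAST- or FASTA-style search. The example structures are toy shapes, not real riboswitch folds.

```python
from difflib import SequenceMatcher
from itertools import groupby

# Hypothetical alphabet (NOT the encoding used in the paper):
# "(" = 5' side of a stem, ")" = 3' side of a stem, "." = unpaired region.
_SYMBOLS = {"(": "s", ")": "t", ".": "l"}


def structure_to_string(dot_bracket, long_run=4):
    """Collapse each run of identical dot-bracket symbols into one letter.

    Runs of at least `long_run` positions are written in upper case, so the
    string keeps a coarse notion of stem and loop size.
    """
    letters = []
    for symbol, run in groupby(dot_bracket):
        if symbol not in _SYMBOLS:
            raise ValueError(f"unexpected character: {symbol!r}")
        length = sum(1 for _ in run)
        letter = _SYMBOLS[symbol]
        letters.append(letter.upper() if length >= long_run else letter)
    return "".join(letters)


def best_window(query, target):
    """Slide `query` along `target`; return (offset, similarity) of the best window.

    Similarity is SequenceMatcher.ratio(), used here only as a simple
    stand-in for a sequence-based search score.
    """
    width = len(query)
    best = (-1, 0.0)
    if width == 0 or len(target) < width:
        return best
    for offset in range(len(target) - width + 1):
        window = target[offset:offset + width]
        score = SequenceMatcher(None, query, window).ratio()
        if score > best[1]:
            best = (offset, score)
    return best


# Toy three-way-junction shape (made up for illustration).
motif = "((((...(((((.......)))))..(((((.......)))))...))))"
# Toy "gene" structure: a small hairpin, the motif, and an unpaired tail.
gene = "..((..))...." + motif + "...."

motif_string = structure_to_string(motif)
gene_string = structure_to_string(gene)
hit = best_window(motif_string, gene_string)
```

Once both structures are plain strings, any sequence-comparison routine can be swapped in for `best_window`; the sliding-window scan is chosen here only because it is short and has no external dependencies. Note that `hit` is an offset into the encoded string, not a nucleotide position.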

MeSH terms

  • Algorithms*
  • Bacillus / genetics
  • Base Sequence
  • Computational Biology*
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Protein Biosynthesis / genetics
  • Purines / biosynthesis
  • RNA / chemistry*
  • RNA, Bacterial / chemistry*
  • RNA, Bacterial / genetics
  • RNA, Fungal / chemistry*
  • RNA, Fungal / genetics
  • Ribosomes / genetics
  • Saccharomyces cerevisiae / genetics
  • Sequence Analysis, RNA / methods


Substances

  • Purines
  • RNA, Bacterial
  • RNA, Fungal
  • RNA