Intramolecular G-quadruplexes (G4s) are secondary structures that may form within G-rich stretches of nucleic acids. Although their presence has been associated with genomic instability and mutagenicity, recent reports suggest their involvement in the regulation of diverse cellular events, including transcription and translation. The majority of data regarding G4s stems from mammalian and yeast studies, leaving plant G4s almost unexplored. Using publicly available Arabidopsis thaliana and Oryza sativa whole-genome sequencing (WGS) data, we examined the single-nucleotide variability of sequences predicted to form G4 structures (pG4s). We focused our analysis on protein-coding transcripts and compared the results to well-characterized Homo sapiens data. We demonstrate that the overall high variability of pG4s is not uniform and differs between gene structural elements. Specifically, plant AUG-containing pG4s, located within 5'UTR/CDS junctions, are abundant and do not appear to show an elevated frequency of sequence change, indicating their functional relevance. Furthermore, we show that substitutions lowering the probability of G4 formation are preferred over neutral or stabilizing modifications.
Keywords: Arabidopsis; G-quadruplex; RNA; plants; rice; sequence variability.
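
As a rough illustration of what "sequences predicted to form G4 structures" and "substitutions lowering the probability of G4 formation" can mean in practice, the following minimal sketch scans a sequence for the widely used canonical motif (four runs of at least three G separated by loops of 1 to 7 nt) and checks whether a single substitution removes the match. The motif choice, the function names `find_pg4s` and `disrupts_pg4`, and the toy sequence are assumptions made for illustration; the abstract does not state which prediction method the study used.

```python
import re

# Canonical quadruplex motif: four G-runs (>= 3 G) separated by loops of 1-7 nt.
# This pattern is an assumption for illustration, not the study's predictor.
PG4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_pg4s(seq):
    """Return (start, end, motif) for non-overlapping matches on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PG4_PATTERN.finditer(seq)]


def disrupts_pg4(seq, pos, alt):
    """True if replacing seq[pos] with alt removes every motif match covering pos."""
    seq = seq.upper()
    before = [h for h in find_pg4s(seq) if h[0] <= pos < h[1]]
    mutated = seq[:pos] + alt.upper() + seq[pos + 1:]
    after = [h for h in find_pg4s(mutated) if h[0] <= pos < h[1]]
    return bool(before) and not after


# Invented toy sequence (not taken from any genome) containing one motif.
example = "ATGGGAGGGTTGGGACGGGCA"

hits = find_pg4s(example)
g_tract_hit = disrupts_pg4(example, 7, "A")  # substitution inside a G-run
loop_hit = disrupts_pg4(example, 9, "C")     # substitution inside a loop
```

In this toy case, a substitution within a G-run abolishes the predicted motif, whereas a substitution in a loop leaves it intact. A regular expression gives only a yes/no prediction and scans a single strand; score-based predictors would be needed to rank substitutions as destabilizing, neutral, or stabilizing.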