The G-quadruplex is one of the most frequently studied secondary DNA structures and consists of four guanine residues that interact through Watson-Crick and Hoogsteen pairing. G-quadruplex formation is thought to act as a molecular switch for gene expression. Genome-wide analyses of G-quadruplexes have been published for many species; however, only one genome-wide analysis of G-quadruplexes in plants has been reported. Here, we propose a new two-step approach: first identifying G-quadruplex-forming sequences (potential G4 DNA motif regions: G4MRs), and then classifying the positional relationships between G4MRs and genes. Using this approach, we exhaustively searched for G4MRs in the whole genomes of eight species: Arabidopsis thaliana, Oryza sativa subsp. japonica, Populus trichocarpa, Vitis vinifera, Homo sapiens, Danio rerio, Drosophila melanogaster, and Caenorhabditis elegans. We then classified genes on the basis of their positional relationships to their proximal G4MRs. We identified novel rules for G4MRs in plants, such as G4MR enrichment in the template strands at transcription start sites (TSSs). Next, we focused on the template strands at TSSs and conducted gene ontology (GO) analysis of genes proximal to G4MRs. We identified GO terms such as chloroplast and nucleosome (or histone) in O. sativa. Although these terms were strongly associated with G4MR-proximal genes in O. sativa, only weak associations were identified in the other plants. These results will be helpful for elucidating the functional roles of G4 DNA.
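The two-step idea described above (find candidate motifs, then relate them to genes) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define a G4MR, so the sketch assumes the widely used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The function names, the `window` parameter, and the toy sequence are all hypothetical.

```python
import re

# ASSUMPTION: commonly used G4 motif (four G-runs of >=3, loops of 1-7 nt).
# The paper's actual G4MR definition is not given in the abstract.
LOOP = "[ACGT]{1,7}"
G_PATTERN = re.compile(LOOP.join(["G{3,}"] * 4))
# A G-rich motif on the minus strand appears as a C-rich motif on the plus strand.
C_PATTERN = re.compile(LOOP.join(["C{3,}"] * 4))


def find_g4_motifs(seq):
    """Step 1: return sorted (start, end, strand) tuples, 0-based half-open,
    for non-overlapping motif matches on both strands of `seq`."""
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G_PATTERN.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_PATTERN.finditer(seq)]
    return sorted(hits)


def classify_near_tss(hit, tss, gene_strand, window=100):
    """Step 2 (hypothetical rule): label a motif that overlaps the region
    [tss - window, tss + window] as lying on the coding or template strand.
    Returns None when the motif does not overlap that region."""
    start, end, strand = hit
    if start > tss + window or end <= tss - window:
        return None
    # The coding (non-template) strand is the strand the gene is annotated on.
    return "coding" if strand == gene_strand else "template"


if __name__ == "__main__":
    toy = "AAAA" + "GGGAGGGAGGGAGGG" + "TTTT" + "CCCTCCCTCCCTCCC" + "AAAA"
    for hit in find_g4_motifs(toy):
        print(hit, classify_near_tss(hit, tss=30, gene_strand="+", window=10))
```

Matching the C-rich pattern on the plus strand avoids building a reverse complement and keeps all coordinates in one reference frame, which simplifies the comparison with gene annotations in the second step.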
Copyright © 2012 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.