A guide to computational methods for G-quadruplex prediction

Nucleic Acids Res. 2020 Jan 10;48(1):1-15. doi: 10.1093/nar/gkz1097.

Abstract

Guanine-rich nucleic acids can fold into non-B DNA or RNA structures called G-quadruplexes (G4). Recent methodological developments have allowed the characterization of specific G-quadruplex structures in vitro as well as in vivo and, at a much higher throughput, in silico, which has greatly expanded our understanding of G4-associated functions. Typically, the consensus motif G3+N1-7G3+N1-7G3+N1-7G3+ (four runs of three or more guanines separated by loops of one to seven nucleotides) has been used to identify potential G-quadruplexes from primary sequence. Since then, various algorithms have been developed to predict the potential formation of quadruplexes directly from DNA or RNA sequences, and the number of studies reporting genome-wide G4 exploration across species has rapidly increased. More recently, new methodologies have also appeared, proposing other estimates that consider non-canonical sequences and/or structure propensity and stability. The present review aims to provide an updated overview of the current open-source G-quadruplex prediction algorithms, with straightforward examples of their implementation.
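
As an illustration of the consensus motif quoted in the abstract, the following minimal Python sketch scans a sequence with a regular expression. It is not taken from the review or from any of the tools it covers: the names G4_PATTERN and find_g4_motifs are hypothetical, only the strand given is searched (a G4 on the opposite strand would appear as a C-rich motif and would need the reverse complement to be scanned separately), and overlapping candidates are not reported.

```python
import re

# Consensus motif from the abstract: four runs of >= 3 guanines
# separated by three loops of 1 to 7 nucleotides (any base, DNA or RNA).
# G4_PATTERN and find_g4_motifs are illustrative names, not from the review.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, match) tuples for non-overlapping motif hits.

    Coordinates are 0-based and half-open, and refer to the strand given.
    """
    sequence = sequence.upper()  # accept lower-case (e.g. soft-masked) input
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    for hit in find_g4_motifs("TTGGGAGGGAGGGAGGGTT"):
        print(hit)
```

Because the loop class also admits G, the regular expression engine may resolve a long guanine stretch in more than one way; the sketch simply reports whichever match the engine finds first from each start position.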

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Base Sequence
  • Benchmarking
  • Drosophila melanogaster / genetics
  • G-Quadruplexes*
  • Genome*
  • Guanine / chemistry*
  • Guanine / metabolism
  • Humans
  • Machine Learning
  • Mice
  • Models, Molecular
  • Models, Statistical*
  • Software*
  • Zebrafish / genetics

Substances

  • Guanine