RNA. 2019 May;25(5):573-589.
doi: 10.1261/rna.068551.118. Epub 2019 Feb 21.

Going Beyond Base-Pairs: Topology-Based Characterization of Base-Multiplets in RNA

Free PMC article

Sohini Bhattacharya et al. RNA.

Abstract

Identification and characterization of base-multiplets, which are essentially mediated by base-pairing interactions, can provide insights into the diversity in the structure and dynamics of complex functional RNAs, and thus facilitate hypothesis-driven biological research. The necessary nomenclature scheme, an extension of the geometric classification scheme for base-pairs by Leontis and Westhof, is, however, available only for base-triplets. In the absence of information on topology, this scheme is not applicable to quartets and higher-order multiplets. Here we propose a topology-based classification scheme which, in conjunction with a graph-based algorithm, can be used for the automated identification and characterization of higher-order base-multiplets in RNA structures. In this approach, the RNA structure is represented as a graph, where nodes represent nucleotides and edges represent base-pairing connectivity. Connected components of n nodes within these graphs constitute subgraphs representing multiplets of n nucleotides. The different topological variants of the RNA multiplets thus correspond to different nonisomorphic forms of these subgraphs. To annotate RNA base-multiplets unambiguously, we propose a set of topology-based nomenclature rules for quartets, which are extendable to higher multiplets. We also demonstrate the utility of our approach for the identification and annotation of higher-order RNA multiplets by investigating the occurrence contexts of selected examples in order to gain insights into their probable functional roles.
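
The graph representation described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes that base-pairs have already been annotated and are supplied as pairs of nucleotide labels, builds an undirected adjacency map, and reports each connected component with three or more nodes as a candidate multiplet. The function names are hypothetical; the four-nucleotide example reuses the labels from the Figure 3B caption, while the extra `5A`/`60U` pair is invented to show that an isolated base-pair is not reported.

```python
from collections import defaultdict


def build_graph(base_pairs):
    """Undirected graph: nodes are nucleotides, edges are base-pairing contacts."""
    graph = defaultdict(set)
    for a, b in base_pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_multiplets(base_pairs, min_size=3):
    """Return connected components that contain at least min_size nucleotides."""
    graph = build_graph(base_pairs)
    seen = set()
    multiplets = []
    for start in graph:
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            multiplets.append(sorted(component))
    return multiplets


# Labels of the cyclic quartet from the Figure 3B caption, plus one
# invented isolated base-pair (5A:60U) that should be ignored.
example_pairs = [
    ("20C", "28G"), ("28G", "31C"), ("31C", "19G"), ("19G", "20C"),
    ("5A", "60U"),
]
print(find_multiplets(example_pairs))
```

Each returned component is only a set of nucleotides; distinguishing topological variants requires a further look at how the nodes within a component are connected.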

Keywords: RNA as graph; RNA structural bioinformatics; RNA structural elements; graph mining; nomenclature of RNA base-quartets; topology of RNA base-multiplets.

Figures

FIGURE 1.
(A) Three distinct base-pairing edges of the RNA bases: the Watson–Crick edge (W), the Hoogsteen edge (H) (C-H edge in pyrimidines), and the Sugar edge (S). A given edge of one base can potentially pair up with any one of the three edges of a second base, which have compatible hydrogen bond donors and acceptors, to form a base-pair. (B) Two bases can approach in either cis or trans orientation of the glycosidic bonds around the axis defined by drawing a line parallel to and between the hydrogen bonds joining the edges. Broken lines represent interbase hydrogen bonding interactions.
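
Combining the three edges with the two glycosidic bond orientations gives the 12 geometric families of the Leontis and Westhof scheme (six unordered edge pairings, each in cis or trans). As a small illustration, the snippet below enumerates them using the cWW/tHS style of label that appears in the later captions. The function name is hypothetical, and writing the edge letters in the fixed order W, H, S is an assumption made for display only; in an annotated base-pair the letter order follows the order of the two nucleotides, which is why labels such as tSW or cHW also occur.

```python
from itertools import combinations_with_replacement

EDGES = ("W", "H", "S")    # Watson-Crick, Hoogsteen, Sugar
ORIENTATIONS = ("c", "t")  # cis, trans


def geometric_families():
    """Enumerate base-pair families as orientation + edge + edge labels."""
    return [
        orientation + edge_1 + edge_2
        for edge_1, edge_2 in combinations_with_replacement(EDGES, 2)
        for orientation in ORIENTATIONS
    ]


print(geometric_families())
```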
FIGURE 2.
Possible topological varieties in (A) triplets, (B) quartets, and (C) pentets. Among these, only Q1, Q2, Q3, and Q4 are observed in nature for quartets, and only P1, P2, P3, P8 (rare), and P12 (rare) for pentets. Colored circles represent nodes with different degrees. An accurate nomenclature of higher-order multiplets requires an unambiguous assignment of numbers to these nodes. This can be achieved by implementing the priority rules discussed in this work and is illustrated in Supplemental Figures S1 and S2.
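
For quartets, the nonisomorphic forms can be told apart by the sorted node degrees alone, because the six connected simple graphs on four nodes have six distinct degree sequences. The sketch below relies on that graph-theoretic fact. It is an illustration rather than the authors' algorithm: the function name is hypothetical, and the descriptive labels other than "linear", "star", and "cyclic-4" (the terms used in the captions here) are placeholders, not the paper's nomenclature. The shortcut does not carry over to pentets, where different topologies can share a degree sequence and a full isomorphism check would be needed.

```python
from collections import Counter

# Sorted degree sequence -> descriptive label for each connected 4-node graph.
QUARTET_TOPOLOGIES = {
    (1, 1, 2, 2): "linear",
    (1, 1, 1, 3): "star",
    (2, 2, 2, 2): "cyclic-4",
    (1, 2, 2, 3): "triangle with one pendant node",  # placeholder label
    (2, 2, 3, 3): "4-cycle with one chord",          # placeholder label
    (3, 3, 3, 3): "fully connected",                 # placeholder label
}


def quartet_topology(edges):
    """Classify a four-nucleotide multiplet from its base-pairing edges."""
    unique_edges = {frozenset(edge) for edge in edges}  # ignore duplicates
    degree = Counter()
    for edge in unique_edges:
        if len(edge) != 2:
            raise ValueError("each edge must join two different nucleotides")
        for node in edge:
            degree[node] += 1
    if len(degree) != 4:
        raise ValueError("expected exactly four nucleotides")
    key = tuple(sorted(degree.values()))
    if key not in QUARTET_TOPOLOGIES:
        raise ValueError("edges do not form a connected quartet")
    return QUARTET_TOPOLOGIES[key]


# Nucleotide labels taken from the captions of Figures 3A, 3B, and 5B.
linear = [("1259C", "1276G"), ("1276G", "1282C"), ("1282C", "1255G")]
cyclic = [("20C", "28G"), ("28G", "31C"), ("31C", "19G"), ("19G", "20C")]
star = [("2448G", "2610A"), ("2448G", "2095C"), ("2448G", "2257U")]

for quartet in (linear, cyclic, star):
    print(quartet_topology(quartet))
```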
FIGURE 3.
Geometry of CGCG quartets in (A) linear topology having base-pairs 1259C:1276G W:W C (blue), 1276G:1282C S:S C (magenta), and 1282C:1255G W:W C (green); and in (B) cyclic-4 topology having base-pairs 20C:28G W:W C (blue), 28G:31C S:S C (magenta), 31C:19G W:W C (green), and 19G:20C S:S C (orange). Broken lines represent interbase hydrogen bonding interactions.
FIGURE 4.
(A) Schematic representation and nomenclature rules for four different quartet topologies. An example of each topology is also shown. (B) Nomenclature of Q5 and Q6 quartet topologies.
FIGURE 5.
(A) Example of a quartet having Q1 or linear topology: the 2800C-2643G-2820A-2900G cWW-tSS-tHS quartet observed in T. thermophilus 23S rRNA. (B) Example of a quartet having Q2 or star topology: the 2448G[2610A 2095C 2257U] tSW/cWW/cHW quartet present in T. thermophilus 23S rRNA.
FIGURE 6.
(A) Example of a pentet having P1 or linear topology: the 2775A-2575C-2558G-2799A-2774U cSS-cWW-cSS-tHW pentet observed in H. marismortui 23S rRNA. (B) Example of P2 pentet topology: the 184G[153C[439A] 186A 150G] [cWW-cSS]/cSS/tHW pentet present in H. marismortui 23S rRNA. (C) The conserved structure of the P3 pentet topology: the 547A[<397A 37U> 498U[404U]] cSH/tWS/[tHW-cHW], tWW pentet observed in T. thermophilus 16S rRNA. (D) The P8 or cyclic pentet 37A-106C-39G-104A-36G-> cHW-tWW-tSH-tWS-cSW-> observed in 5.8S rRNA of S. cerevisiae. Constituent base-pairs are shown separately in boxes. (E) Example of P12 or star pentet: 2349G[644A 645C 2368C 2382G] tSW/tSW/cWW/tHW observed in E. coli 23S rRNA, which mediates interactions between nucleotides of domain-II (backbone shown in dark blue) and domain-VII (backbone shown in sky blue).
FIGURE 7.
Structural context of RNA base-multiplets beyond base-pentets. (A1) Sextet element found in class-I preQ1 riboswitch, which mediates pseudoknot formation between S1 helix (green) and L3 loop (sky blue). (A2) Sextet element found in SAM-II riboswitch bound to S-adenosylmethionine, which mediates pseudoknot formation between P1 helix (purple) and L3 loop (light green). (A3) Sextet element found in aptamer domain of M-box riboswitch. This sextet element mediates interaction between P2 helix (pink), P4 helix (red), L4 loop (sky blue), and L5 loop (light green). (B1) Structural context of octet element in domain-I of H. marismortui 23S rRNA. (B2) Structural context of octet element in domain-I of T. thermophilus 23S rRNA. (B3) The conserved topology of octet elements shown in B1 and B2. (B4) Structural context of octet element in D. radiodurans 23S rRNA. (B5) Topology of the octet element shown in B4. (C1) The molecular view of the structural context of a nonet in T. thermophilus 23S rRNA (Domain-I). Four different colors are used to differentiate between four distantly placed RNA stretches and corresponding schematic representation of the interaction pattern of constituent nucleotides. (C2) Secondary structure map of domain-I of T. thermophilus 23S rRNA, where interacting regions are highlighted in colored boxes.
