RNA. 2017 Feb;23(2):240-249.
doi: 10.1261/rna.058404.116. Epub 2016 Nov 7.

RiboCAT: A New Capillary Electrophoresis Data Analysis Tool for Nucleic Acid Probing

Free PMC article

William A Cantara et al. RNA.

Abstract

Chemical and enzymatic probing of RNA secondary structure and RNA/protein interactions provides the basis for understanding the functions of structured RNAs. However, the ability to rapidly perform such experiments using capillary electrophoresis has been hampered by relatively labor-intensive data analysis software. While these computationally robust programs have been shown to calculate residue-specific reactivities to a high degree of accuracy, they often require time-consuming manual intervention and lack the ability to be easily modified by users. To alleviate these issues, RiboCAT (Ribonucleic acid capillary-electrophoresis analysis tool) was developed as a user-friendly, Microsoft Excel-based tool that reduces the need for manual intervention, thereby significantly shortening the time required for data analysis. Features of this tool include (i) the use of an Excel platform, (ii) a method of intercapillary signal alignment using internal size standards, (iii) a peak-sharpening algorithm to more accurately identify peaks, and (iv) an open architecture allowing for simple user intervention. Furthermore, a complementary tool, RiboDOG (RiboCAT data output generator) was designed to facilitate the comparison of multiple data sets, highlighting potential inconsistencies and inaccuracies that may have occurred during analysis. Using these new tools, the secondary structure of the HIV-1 5' untranslated region (5'UTR) was determined using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), matching the results of previous work.

Keywords: RNA structure; SHAPE; capillary electrophoresis; secondary structure.

Figures

FIGURE 1.
Improvements to CE processing steps reduce data analysis errors. (A) Four primary steps are carried out during analysis of CE data: signal alignment, data preprocessing, sequence alignment, and reactivity calculations. The two main areas of improvement, signal alignment and peak picking, are highlighted in green dashed boxes. (B) Errors associated with signal alignment in QuSHAPE are eliminated using an improved alignment strategy that utilizes internal size standards. Blue and red lines indicate the minus and plus traces, respectively (same in C). (C) The frequency of misidentified peaks in QuSHAPE is significantly reduced in RiboCAT via introduction of a peak-sharpening algorithm. Dots indicate picked peaks. Orange and blue arrows denote missed and incorrectly picked peaks, respectively.
FIGURE 2.
Signal alignment using an internal size standard. (A) Optimization of the polynomial order used for signal alignment. The inset shows a zoomed region encompassing orders 5–10 with RMSD values reaching a plateau at an average of X0 = 1.2 at order 9. (B) The unaligned (top) and aligned (bottom) size standards for three separate CE experiments of sequencing (black), no-reagent control (blue), and SHAPE reaction (red). (C) Aligned SHAPE reaction traces for no-reagent control (blue) and SHAPE reagent (red) show highly accurate signal alignment based on the internal size standard.
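
The polynomial alignment described in this caption can be illustrated with a minimal Python sketch. It is not RiboCAT's implementation (RiboCAT is Excel-based); the function names, the peak positions, and the low polynomial orders tried here are all assumptions chosen for illustration. The idea is to fit a polynomial that maps the size-standard peak positions of one capillary onto those of a reference capillary, then compare the residual RMSD across orders.

```python
import numpy as np

def fit_alignment(std_peaks, ref_peaks, order=3):
    """Fit a polynomial mapping this capillary's x-axis onto the reference x-axis."""
    std_peaks = np.asarray(std_peaks, dtype=float)
    ref_peaks = np.asarray(ref_peaks, dtype=float)
    if std_peaks.shape != ref_peaks.shape:
        raise ValueError("the same size-standard peaks are needed in both capillaries")
    if order >= std_peaks.size:
        raise ValueError("polynomial order must be below the number of standard peaks")
    # Polynomial.fit rescales x internally, which keeps higher orders well conditioned.
    return np.polynomial.Polynomial.fit(std_peaks, ref_peaks, deg=order)

def rmsd(a, b):
    """Root-mean-square deviation between two equally long sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Hypothetical size-standard peak positions (made-up values, not from the paper).
ref_peaks = np.array([100.0, 200.0, 300.0, 400.0, 500.0, 600.0])
std_peaks = np.array([92.0, 189.0, 287.0, 386.0, 486.0, 587.0])

# Residual RMSD of the aligned standard peaks for a few candidate orders.
rmsd_by_order = {}
for order in (1, 2, 3):
    mapping = fit_alignment(std_peaks, ref_peaks, order)
    rmsd_by_order[order] = rmsd(mapping(std_peaks), ref_peaks)

# Apply one mapping to the whole x-axis of the capillary being aligned.
mapping = fit_alignment(std_peaks, ref_peaks, order=2)
aligned_axis = mapping(np.arange(90.0, 590.0))
```

With a real size standard there are many more peaks, so higher orders such as those examined in panel A become feasible; the toy data above only support low orders.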
FIGURE 3.
Improved peak picking using a peak-sharpening protocol. (A) The peak-sharpening algorithm computes an enhanced data trace (dashed gray line) that has significantly exaggerated the peaks and troughs compared to the raw data trace (solid black line). (B) The enhanced data allow for more robust peak assignment. In the example shown, five out of 10 peaks can be assigned using the raw data (blue triangles), whereas all 10 peaks are properly assigned when using the peak-sharpened, enhanced data (red circles).
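
To make the effect of peak sharpening concrete, here is a small Python sketch. The abstract does not specify RiboCAT's algorithm, so this uses a common, generic technique as a stand-in: subtracting a scaled discrete second derivative from the trace. The function names, the weight `k`, and the synthetic two-Gaussian trace are assumptions. In the raw trace the two overlapping peaks merge into one maximum; after sharpening, both are found.

```python
import math

def sharpen(y, k):
    """Second-derivative sharpening: y[i] - k * (y[i-1] - 2*y[i] + y[i+1])."""
    out = list(y)  # end points are left unchanged
    for i in range(1, len(y) - 1):
        second_diff = y[i - 1] - 2.0 * y[i] + y[i + 1]
        out[i] = y[i] - k * second_diff
    return out

def find_peaks(y, min_height):
    """Indices of interior local maxima at or above min_height."""
    return [
        i for i in range(1, len(y) - 1)
        if y[i] > y[i - 1] and y[i] >= y[i + 1] and y[i] >= min_height
    ]

def gaussian_sum(n, centers, sigma):
    """Synthetic trace: sum of unit-height Gaussians sampled at 0..n-1."""
    return [
        sum(math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2)) for c in centers)
        for i in range(n)
    ]

# Two peaks 1.8 sigma apart: too close to show separate maxima in the raw trace.
raw = gaussian_sum(n=100, centers=(40, 58), sigma=10.0)
enhanced = sharpen(raw, k=100.0)  # k of about sigma**2 (in samples**2), an assumption

raw_peaks = find_peaks(raw, min_height=0.1 * max(raw))
enhanced_peaks = find_peaks(enhanced, min_height=0.1 * max(enhanced))
print(raw_peaks, enhanced_peaks)
```

The sharpened trace exaggerates both peaks and troughs, so it is better suited to locating peaks than to measuring their areas; in this sketch, intensities would still be read from the raw trace.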
FIGURE 4.
Schematic description of sequence alignment in RiboCAT. (A) To optimize the assignment of peaks to their corresponding nucleotides in the RNA sequence, an initial guess of the alignment is made (peaks labeled “y”) based on the x-axis similarities in peaks from the sequencing (top) and experimental (bottom) traces. The alignment is then shifted incrementally to the left (peaks labeled “x”) or incrementally to the right (peaks labeled “z”), and the RMSD between the x-axis values is calculated for each shift. The shift giving the minimum RMSD over the entire trace is chosen as the correct alignment. (B) In standard Sanger sequencing, the DNA fragment corresponding to a particular residue will include the cognate dideoxy nucleotide (top); however, in both chemical probing methods (middle) and RNase/chemical digestion (bottom), the cognate nucleotide is prohibited from being incorporated by either a chemical modification or backbone cleavage, respectively, leading to an offset of −1.
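
The shift-and-score idea in panel A can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RiboCAT's code: the function name `best_shift`, the `max_shift` and `min_overlap` parameters, and the peak positions are hypothetical. Each candidate shift pairs experimental peaks with sequencing peaks, and the shift with the lowest RMSD between paired x-axis positions is kept.

```python
import math

def best_shift(seq_peaks, exp_peaks, max_shift=3, min_overlap=3):
    """Return (shift, rmsd) minimizing the RMSD between paired peak positions.

    A shift s pairs exp_peaks[i] with seq_peaks[i + s].
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        diffs = [
            seq_peaks[i + shift] - x
            for i, x in enumerate(exp_peaks)
            if 0 <= i + shift < len(seq_peaks)
        ]
        if len(diffs) < min_overlap:
            continue  # too few paired peaks to score this shift reliably
        score = math.sqrt(sum(d * d for d in diffs) / len(diffs))
        if best is None or score < best[1]:
            best = (shift, score)
    if best is None:
        raise ValueError("no shift produced enough overlapping peaks")
    return best

# Hypothetical x-axis peak positions (made-up values).
seq_peaks = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
exp_peaks = [30.5, 39.8, 50.2, 60.1]

shift, score = best_shift(seq_peaks, exp_peaks)
print(shift, round(score, 3))
```

The `min_overlap` guard is a design choice of this sketch: without it, a large shift that pairs only one or two peaks could score a deceptively low RMSD. The −1 offset described in panel B would be applied after the best shift is found.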
FIGURE 5.
RiboCAT replicates QuSHAPE-derived reactivities in the HIV-1 5′UTR. (A,B) The lowest-energy secondary structures calculated by RNAStructure using reactivity values derived with NMIA (A) and 1M6 (B). Sites of mutation, A34U and ΔDIS, are noted with the WT DIS shown. The reactivities at each nucleotide are depicted as colored circles matching the legend in the middle. (C) SHAPE reactivities calculated using QuSHAPE (red) and RiboCAT (black outline) plotted for each nucleotide show a high degree of similarity, with regions of high and low reactivity matching in both plots. (D) The SHAPE reactivities calculated using RiboCAT and QuSHAPE agree closely, with a Pearson's R-value of 0.96.
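
For readers who want to run the comparison in panel D on their own data, the Pearson correlation between two per-nucleotide reactivity profiles can be computed as in the sketch below. The function name and the reactivity values are hypothetical; the numbers are invented for illustration and are not the paper's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Made-up reactivities for the same nucleotides from two analysis tools.
tool_a = [0.05, 0.80, 1.20, 0.10, 0.45, 0.02]
tool_b = [0.07, 0.75, 1.30, 0.12, 0.40, 0.05]
r = pearson_r(tool_a, tool_b)
```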
