Curr Protoc Bioinformatics. 2016 Jun 20;54:12.13.1-12.13.25.
doi: 10.1002/cpbi.4.

Studying RNA Homology and Conservation With Infernal: From Single Sequences to RNA Families

Free PMC article

Lars Barquist et al. Curr Protoc Bioinformatics.

Abstract

Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequences allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.

Keywords: RNA; Rfam; alignment; conservation; covariance model; homology; ncRNA.

Figures

Figure 1. Partial results of a BLAST search of the “nr” sequence database using the E. coli MicA sequence as the query.
The tabular view provides important information that can be used to pick putative homolog sequences, including species and strain information (column 1), query sequence coverage (column 4), E-value (column 5), and percent identity (column 6). Also note the accession number in column 7, which will be useful for looking up sequences in other databases (e.g., ENA).
Figure 2. Genomic context of micA.
ENA browser view of the region surrounding micA in E. coli. All of our selected homologs show conserved synteny with the luxS and gshA genes, providing additional evidence for their evolutionary relationship. Note the Rfam annotation in this region, matching our sequence. This view can be generated by navigating to http://www.ebi.ac.uk/ena/data/view/U00096.3 and entering the genome coordinates in the “Base range” boxes.
Figure 3. Consensus alignment of MicA sequences.
Visualization of a T-Coffee consensus alignment incorporating information from six different alignment and structure prediction methods on the WAR webserver.
Figure 4. Editing alignments in Emacs RALEE mode.
Screenshots showing three different color markups of the MicA consensus alignment highlighting secondary structure (top), sequence conservation (middle), and compensatory mutations (bottom).
Figure 5. MicA alignment following manual refinement.
Manual refinement using RALEE restores full conservation to the first stem-loop and removes an unlikely insertion within the hairpin structure. Additionally, careful manipulation of the sequence between the two hairpins reveals strong conservation of a short A/U-rich sequence motif that was previously obscured.
Figure 6. MicA alignment containing additional homologs found using Plan B.
Two new sequences have been added to the alignment (D. zeae, CP001655; S. glossinidius, AP008232), adding structural and sequence diversity.
Figure 7. Xenorhabdus MicA homologs found using Plan A.
The top panel shows the location of a putative micA homolog with a marginal Infernal E-value in X. nematophila, sharing synteny with established micA sequences in E. coli. The bottom panel shows a manually refined alignment of three putative Xenorhabdus MicA sequences, displaying structural and sequence similarity to our previously constructed alignment of enterobacterial MicA sequences.
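
The selection criteria named in the Figure 1 caption (query coverage, E-value, percent identity) can be illustrated with a minimal Python sketch. This is not part of the published protocol: the `Hit` record, the `pick_candidates` function, the default thresholds, and the toy hits with placeholder accessions are all assumptions made up for illustration, and suitable cutoffs will depend on the RNA being studied.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of a BLAST hit table (hypothetical record for illustration)."""
    description: str   # species and strain information (column 1)
    coverage: float    # query sequence coverage, in percent (column 4)
    evalue: float      # E-value (column 5)
    identity: float    # percent identity (column 6)
    accession: str     # accession number (column 7)


def pick_candidates(hits, min_coverage=80.0, max_evalue=1e-3,
                    identity_range=(0.0, 100.0)):
    """Return hits passing illustrative, assumed thresholds.

    A hit is kept when it covers enough of the query, has a small enough
    E-value, and has a percent identity inside identity_range (inclusive).
    """
    low, high = identity_range
    return [
        h for h in hits
        if h.coverage >= min_coverage
        and h.evalue <= max_evalue
        and low <= h.identity <= high
    ]


# Toy hits with made-up values and placeholder accessions (not real data).
EXAMPLE_HITS = [
    Hit("toy organism A", 100.0, 1e-30, 100.0, "TOY0001"),
    Hit("toy organism B", 60.0, 1e-5, 90.0, "TOY0002"),   # low coverage
    Hit("toy organism C", 95.0, 5.0, 70.0, "TOY0003"),    # weak E-value
]

if __name__ == "__main__":
    for hit in pick_candidates(EXAMPLE_HITS):
        print(hit.accession, hit.description)
```

Narrowing `identity_range` (for example to exclude near-identical hits) is one way to favor a more diverse candidate set; whether that is appropriate is a judgment call for the curator, not something this sketch decides.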
