Given the availability of sequence information for many species, one can examine how the sequence of a gene varies among different organisms. This is accomplished by aligning the sequences and observing patterns of conservation, mutation, and counter-mutation at different positions in the gene. Embedded in these patterns is information on energetic coupling and macromolecular interactions, which can be deciphered by applying statistical algorithms. Here we report a robust approach for predicting interactions within (or between) any type of biopolymer, including proteins, RNAs, and RNA-protein complexes. Rather than maximizing the number of predictions, this approach is designed to detect a limited number of highly significant interactions, thereby providing accurate results from alignments that contain a modest number of sequences (20-60). The versatility and accuracy of the algorithm are demonstrated by the successful prediction of important intramolecular interactions within RNAs, modified RNAs, and proteins, as well as the prediction of RNA-protein and protein-protein interactions.
(c) 2005 Wiley-Liss, Inc.