Motivation: Argonaute-interacting WG/GW proteins are characterized by repeated sequence motifs containing glycine (G) and tryptophan (W). The motifs appear remarkably tolerant of amino acid substitutions, and their sequences are non-contiguous. Our previous approach to detecting GW domains, based on scoring their gross amino acid composition, allowed annotation of several novel proteins involved in gene silencing. The accumulation of new experimental data and more advanced applications revealed that the algorithm's prediction selectivity was insufficient. Additionally, W-motifs, though critical in gene regulation, have not yet been annotated in any available online resource.
Results: We present an improved set of computational tools allowing efficient management and annotation of W-based motifs involved in gene silencing. The new prediction algorithms provide novel functionality by annotating W-containing domains at the level of local sequence motifs rather than by overall compositional properties. This approach significantly improves on the previous method in both prediction sensitivity and selectivity. Applying the algorithm allowed annotation of a comprehensive list of putative Argonaute-interacting proteins across eukaryotes. An in-depth characterization of the domains' properties indicates their intrinsically disordered character. In addition, we created a knowledge-based portal (whub) that provides access to tools and information on RNAi-related tryptophan-containing motifs.
Availability and implementation: The web portal and tools are freely available at http://www.comgen.pl/whub.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: email@example.com.