Integrative data analysis indicates an intrinsic disordered domain character of Argonaute-binding motifs

Bioinformatics. 2015 Feb 1;31(3):332-9. doi: 10.1093/bioinformatics/btu666. Epub 2014 Oct 9.

Abstract

Motivation: Argonaute-interacting WG/GW proteins are characterized by the presence of repeated sequence motifs containing glycine (G) and tryptophan (W). The motifs appear to be remarkably tolerant of amino acid substitutions, and their sequences are non-contiguous. Our previous approach to the detection of GW domains, based on scoring their gross amino acid composition, allowed annotation of several novel proteins involved in gene silencing. The accumulation of new experimental data and more advanced applications revealed deficiencies in the algorithm's prediction selectivity. Additionally, W-motifs, though critical in gene regulation, have not yet been annotated in any available online resource.
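To make the idea of scoring gross amino acid composition concrete, the following is a minimal sketch, not the authors' published algorithm. It assumes a simple sliding window and scores each window by its combined fraction of G and W residues. The function name `gw_composition_scores` and the default window length are hypothetical choices made for illustration only.

```python
def gw_composition_scores(seq, window=20):
    """Return the G+W fraction of every sliding window over seq.

    Illustrative assumption: the score is the plain fraction of G and W
    residues in the window. The actual scoring scheme of the earlier
    method is not described in this abstract.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return []  # sequence too short for a single full window
    scores = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        scores.append((chunk.count("G") + chunk.count("W")) / window)
    return scores
```

A score of this kind depends only on how many G and W residues a window contains, not on how they are arranged, which is one way a purely compositional measure can lose selectivity.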

Results: We present an improved set of computational tools allowing efficient management and annotation of W-based motifs involved in gene silencing. The new prediction algorithms provide novel functionality by annotating W-containing domains at the level of local sequence motifs rather than by overall compositional properties. This approach represents a significant improvement over the previous method in terms of prediction sensitivity and selectivity. Application of the algorithm allowed annotation of a comprehensive list of putative Argonaute-interacting proteins across eukaryotes. An in-depth characterization of the domains' properties indicates their intrinsically disordered character. In addition, we created a knowledge-based portal (whub) that provides access to tools and information on RNAi-related tryptophan-containing motifs.
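As a rough illustration of annotation at the local motif level, the sketch below reports the position of every GW or WG dipeptide in a sequence. It is an assumption-laden simplification: the published tools are not described in detail here and certainly involve more than dipeptide matching. The function name `find_gw_motifs` is hypothetical.

```python
import re

# A lookahead is used so that overlapping hits are all reported,
# e.g. "GWG" contains both a GW (at 0) and a WG (at 1).
_GW_OR_WG = re.compile(r"(?=(GW|WG))")


def find_gw_motifs(seq):
    """Return (0-based start, motif) for every GW or WG dipeptide in seq.

    Illustrative assumption: a motif is a bare GW/WG dipeptide; no
    flanking context or scoring is applied.
    """
    return [(m.start(), m.group(1)) for m in _GW_OR_WG.finditer(seq.upper())]
```

Unlike a composition score, this output is positional, so each hit can be annotated individually along the sequence.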

Availability and implementation: The web portal and tools are freely available at http://www.comgen.pl/whub.

Contact: wmk@amu.edu.pl

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Motifs / genetics*
  • Argonaute Proteins / chemistry*
  • Argonaute Proteins / genetics
  • Argonaute Proteins / metabolism*
  • Glycine / chemistry*
  • Protein Binding / genetics*
  • Protein Structure, Tertiary
  • Repetitive Sequences, Amino Acid / genetics*
  • Software
  • Tryptophan / chemistry*

Substances

  • Argonaute Proteins
  • Tryptophan
  • Glycine