Identification of WD40 repeats by secondary structure-aided profile-profile alignment

J Theor Biol. 2016 Jun 7:398:122-9. doi: 10.1016/j.jtbi.2016.03.025. Epub 2016 Mar 25.

Abstract

A WD40 protein typically contains four or more repeats of ~40 residues, each ending with the Trp-Asp dipeptide; the repeats fold into a β-propeller, with four β strands in each repeat. WD40 proteins often function as scaffolds for protein-protein interactions and are involved in numerous fundamental biological processes. Despite their important functional role, the "velcro" closure of WD40 propellers and the diversity of WD40 repeats make their identification a difficult task. Here we develop a new WD40 Repeat Recognition method (WDRR), which uses predicted secondary structure information to generate candidate repeat segments and then employs a profile-profile alignment to identify the correct WD40 repeats among the candidate segments. In particular, we design a novel alignment scoring function that combines the dot product and BLOSUM62, thereby achieving a good balance between sensitivity and accuracy. Taking advantage of these strategies, WDRR effectively reduces the false positive rate and accurately identifies more remotely homologous WD40 repeats with precise repeat boundaries. We further use WDRR to re-annotate the Pfam families in the β-propeller clan (CL0186) and identify a number of WD40 repeat proteins with high confidence across nine model organisms. The WDRR web server and the datasets are available at http://protein.cau.edu.cn/wdrr/.
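The abstract does not give the form of the scoring function, so the following is only a minimal sketch of how a dot product term and a BLOSUM62 term might be combined when scoring one pair of profile columns. The linear mix, the `weight` parameter, the function names, and the three-residue toy alphabet are all assumptions made for illustration and are not taken from WDRR; the substitution values are the BLOSUM62 entries for W, D, and G.

```python
# Hypothetical sketch: scoring one pair of profile columns by mixing a
# dot product term with a BLOSUM62-weighted term. Not the WDRR formula.

# Toy alphabet (assumption); a real profile column covers all 20 residues.
ALPHABET = "WDG"

# BLOSUM62 entries for the three residues above (symmetric matrix,
# stored once per unordered pair).
BLOSUM62_SUBSET = {
    ("W", "W"): 11, ("D", "D"): 6, ("G", "G"): 6,
    ("W", "D"): -4, ("W", "G"): -2, ("D", "G"): -1,
}


def subst(a, b):
    """Look up the substitution score for residues a and b in either order."""
    return BLOSUM62_SUBSET.get((a, b), BLOSUM62_SUBSET.get((b, a)))


def dot_score(p, q):
    """Dot product of two residue-frequency columns."""
    return sum(p[a] * q[a] for a in ALPHABET)


def blosum_score(p, q):
    """Expected substitution score between two frequency columns."""
    return sum(p[a] * q[b] * subst(a, b) for a in ALPHABET for b in ALPHABET)


def combined_score(p, q, weight=0.5):
    """Linear mix of the two terms; `weight` is an assumed tuning parameter."""
    return weight * dot_score(p, q) + (1 - weight) * blosum_score(p, q)


# Two columns as residue-frequency dictionaries.
trp_column = {"W": 1.0, "D": 0.0, "G": 0.0}
asp_column = {"W": 0.0, "D": 1.0, "G": 0.0}

same = combined_score(trp_column, trp_column)   # identical columns
diff = combined_score(trp_column, asp_column)   # dissimilar columns
```

In this sketch the dot product rewards columns with overlapping residue frequencies, while the substitution term also gives credit (or a penalty) for pairs of different residues, which is one plausible reading of why mixing the two could trade off sensitivity against accuracy.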

Keywords: Bioinformatics; Profile–profile alignment; Remote sequence homology; Secondary structure prediction; WD40 repeat.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Crystallography, X-Ray
  • Mice
  • Pattern Recognition, Automated
  • Protein Domains
  • Protein Structure, Secondary
  • Sequence Alignment
  • WD40 Repeats*