CutProtFam-Pred: detection and classification of putative structural cuticular proteins from sequence alone, based on profile hidden Markov models

Insect Biochem Mol Biol. 2014 Sep;52:51-9. doi: 10.1016/j.ibmb.2014.06.004. Epub 2014 Jun 27.

Abstract

The arthropod cuticle is a composite, bipartite system, made of chitin filaments embedded in a proteinaceous matrix. The physical properties of cuticle are determined by the structure and the interactions of its two major components, cuticular proteins (CPs) and chitin. The proteinaceous matrix consists mainly of structural cuticular proteins. The majority of the structural proteins that have been described to date belong to the CPR family, and they are identified by the conserved R&R region (Rebers and Riddiford Consensus). Two major subfamilies of the CPR family, RR-1 and RR-2, have also been identified from conservation at sequence level and some correlation with the cuticle type. Recently, several novel families, also containing characteristic conserved regions, have been described. The package HMMER v3.0 (http://hmmer.janelia.org/) was used to build profile Hidden Markov Models based on the characteristic regions of 8 of these families (CPF, CPAP3, CPAP1, CPCFC, CPLCA, CPLCG, CPLCW, Tweedle). In brief, these families can be described as having: CPF (a conserved region with 44 amino acids); CPAP1 and CPAP3 (analogous to peritrophins, with 1 and 3 chitin-binding domains, respectively); CPCFC (2 or 3 C-x(5)-C repeats); and four of five low-complexity (LC) families, each with characteristic domains. Using these models, as well as the models previously created for the two major subfamilies of the CPR family, RR-1 and RR-2 (Karouzou et al., 2007), we developed CutProtFam-Pred, an on-line tool (http://bioinformatics.biol.uoa.gr/CutProtFam-Pred) that allows one to query sequences from proteomes or translated transcriptomes, for the accurate detection and classification of putative structural cuticular proteins. The tool has been applied successfully to diverse arthropod proteomes, including a crustacean (Daphnia pulex) and a chelicerate (Tetranychus urticae), but at this taxonomic distance only CPRs and CPAPs were recovered.
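
As an illustration of the classification step described above, the following minimal sketch shows how per-family profile HMM hits could be reduced to one family label per sequence. It assumes the hits come from HMMER's hmmsearch run with the --tblout option against a file containing all family models, so that each row names a target sequence and a query model. The function name, the E-value cutoff of 1e-3, the rule of keeping the highest-scoring family, and the sample rows are all assumptions made for illustration; the abstract does not state the decision rule or thresholds that CutProtFam-Pred actually uses.

```python
from typing import Dict, Iterable, Tuple


def best_family_per_sequence(
    tblout_lines: Iterable[str], max_evalue: float = 1e-3
) -> Dict[str, str]:
    """Assign each sequence to the family model with the highest bit score.

    Expects rows in hmmsearch --tblout layout: whitespace-separated columns
    where column 1 is the target (sequence) name, column 3 the query (model)
    name, column 5 the full-sequence E-value and column 6 the full-sequence
    bit score. The cutoff and best-score rule are illustrative assumptions.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_lines:
        # Skip blank lines and the comment/header lines that start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 18:  # a tblout row has 18 fixed columns
            continue
        sequence_id, family = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue
        # Keep only the best-scoring family seen so far for this sequence.
        if sequence_id not in best or score > best[sequence_id][1]:
            best[sequence_id] = (family, score)
    return {seq: fam for seq, (fam, _score) in best.items()}


# Invented rows for illustration only; they are not results from the paper.
SAMPLE_TBLOUT = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "seqA - RR-2 - 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 1.0 1 0 0 1 1 1 1 -",
    "seqA - RR-1 - 3.0e-12 45.0 0.2 4.0e-12 44.6 0.2 1.0 1 0 0 1 1 1 1 -",
    "seqB - Tweedle - 2.0e-20 70.1 0.0 2.5e-20 69.8 0.0 1.0 1 0 0 1 1 1 1 -",
    "seqC - CPF - 0.5 8.0 0.0 0.7 7.5 0.0 1.0 1 0 0 1 1 1 0 -",
]

assignments = best_family_per_sequence(SAMPLE_TBLOUT)
for sequence_id, family in sorted(assignments.items()):
    print(sequence_id, family)
```

In this sketch a sequence that matches both RR-1 and RR-2 is labelled with whichever model scores higher, and sequences with no hit below the cutoff are left unclassified.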

Keywords: Arthropod cuticle; Cuticular proteins; Profile Hidden Markov Models (pHMMs); Structural cuticular protein families.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Arthropod Proteins / chemistry*
  • Arthropod Proteins / genetics
  • Arthropod Proteins / metabolism
  • Arthropods / chemistry*
  • Arthropods / genetics*
  • Arthropods / metabolism*
  • Chitin / chemistry*
  • Chitin / genetics
  • Chitin / metabolism
  • Computational Biology / methods
  • Markov Chains
  • Molecular Sequence Data
  • Multigene Family
  • Phylogeny
  • Proteome
  • Sequence Alignment
  • Sequence Analysis, Protein

Substances

  • Arthropod Proteins
  • Proteome
  • Chitin