An HMM model for coiled-coil domains and a comparison with PSSM-based predictions

Bioinformatics. 2002 Apr;18(4):617-25. doi: 10.1093/bioinformatics/18.4.617.


Motivation: Large-scale sequence data require methods for the automated annotation of protein domains. Many of the predictive methods are based either on a Position Specific Scoring Matrix (PSSM) of fixed length or on a window-less Hidden Markov Model (HMM). The performance of the two approaches is tested for Coiled-Coil Domains (CCDs). The prediction of CCDs is used frequently, and its optimization seems worthwhile.
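As a rough illustration of the fixed-length PSSM approach mentioned above, the following minimal sketch scores every window of a fixed length in each of the seven heptad registers and assigns each residue the best mean score of any window covering it. The propensity values, the two-class residue alphabet, the 28-residue window, and the names `toy_log_odds` and `window_scores` are assumptions made for illustration; they are not the parameters or the algorithm used in the paper.

```python
# Hypothetical residue classes and log-odds values, for illustration only.
HYDROPHOBIC = set("LIVMA")


def toy_log_odds(residue, heptad_pos):
    """Toy propensity: hydrophobic residues are favored at core positions a (0) and d (3)."""
    core = heptad_pos in (0, 3)
    if residue in HYDROPHOBIC:
        return 1.0 if core else -0.5
    return -1.0 if core else 0.5


def window_scores(seq, window=28):
    """Per-residue score: best mean log-odds over all covering windows and all 7 registers.

    Residues not covered by any full window (for example, when the whole
    sequence is shorter than the window) keep a score of -inf.
    """
    n = len(seq)
    best = [float("-inf")] * n
    for start in range(0, n - window + 1):
        for register in range(7):
            total = sum(
                toy_log_odds(seq[start + k], (register + k) % 7)
                for k in range(window)
            )
            mean = total / window
            # Every residue inside this window inherits the window's score.
            for i in range(start, start + window):
                best[i] = max(best[i], mean)
    return best


if __name__ == "__main__":
    long_repeat = "LKELEDK" * 4   # exactly one full window
    short_repeat = "LKELEDK" * 2  # shorter than the window
    print(window_scores(long_repeat)[:3])
    print(window_scores(short_repeat)[:3])
```

In this sketch a sequence shorter than the window receives no score at all, which is one way a fixed window length can constrain what a PSSM-style method is able to report.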

Results: We have conceived MARCOIL, an HMM for the recognition of proteins with a CCD on a genomic scale. A cross-validated study suggests that MARCOIL improves predictions compared to the traditional PSSM algorithm, especially for some protein families and for short CCDs. The study was designed to reveal differences inherent in the two methods. Potential confounding factors such as differences in the dimension of parameter space and in the parameter values were avoided by using the same amino acid propensities and by keeping the transition probabilities of the HMM constant during cross-validation.
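To make the window-less HMM idea concrete, here is a minimal sketch of posterior decoding with a small hidden Markov model whose transition probabilities are held fixed. It is not MARCOIL: the eight-state topology (one background state plus seven heptad positions), the emission and transition values, the two-class residue alphabet, and the function names are all assumptions chosen to keep the example short.

```python
# Toy HMM: state 0 = background, states 1..7 = heptad positions a..g.
# All probabilities below are hypothetical, not the MARCOIL parameters.
HYDROPHOBIC = set("LIVMA")
N_STATES = 8


def emission(state, residue):
    """Probability of the residue's class (hydrophobic or not) in a state."""
    if state == 0:
        p_h = 0.3
    elif state in (1, 4):  # core positions a and d
        p_h = 0.8
    else:
        p_h = 0.1
    return p_h if residue in HYDROPHOBIC else 1.0 - p_h


def transition(i, j, p_in=0.05, p_out=0.05):
    """Fixed transition probabilities; each row sums to 1."""
    if i == 0:
        return 1.0 - p_in if j == 0 else p_in / 7  # enter at any heptad position
    if j == 0:
        return p_out
    return 1.0 - p_out if j == i % 7 + 1 else 0.0  # a->b->...->g->a


def coil_posterior(seq):
    """Per-residue posterior probability of any coil state (scaled forward-backward)."""
    n = len(seq)
    init = [0.5] + [0.5 / 7] * 7
    fwd, scales = [], []
    for t, r in enumerate(seq):
        if t == 0:
            row = [init[s] * emission(s, r) for s in range(N_STATES)]
        else:
            prev = fwd[-1]
            row = [
                emission(s, r) * sum(prev[q] * transition(q, s) for q in range(N_STATES))
                for s in range(N_STATES)
            ]
        c = sum(row)
        fwd.append([x / c for x in row])
        scales.append(c)
    bwd = [[1.0] * N_STATES for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for s in range(N_STATES):
            bwd[t][s] = sum(
                transition(s, q) * emission(q, seq[t + 1]) * bwd[t + 1][q]
                for q in range(N_STATES)
            ) / scales[t + 1]
    post = []
    for t in range(n):
        gamma = [fwd[t][s] * bwd[t][s] for s in range(N_STATES)]
        post.append(1.0 - gamma[0] / sum(gamma))
    return post


if __name__ == "__main__":
    print([round(p, 2) for p in coil_posterior("LKELEDK" * 2)])
    print([round(p, 2) for p in coil_posterior("KEKEDKE" * 2)])
```

Because no window length is imposed, a model of this kind assigns a posterior to every residue of a sequence of any length, including a stretch of only two heptads.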

Availability: The prediction program and the databases are available at

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Databases, Protein*
  • Genome
  • Information Storage and Retrieval / methods
  • Markov Chains
  • Models, Genetic*
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, Protein / methods*
  • Software*


Substances

  • Proteins