pHMM-tree: phylogeny of profile hidden Markov models

Bioinformatics. 2017 Apr 1;33(7):1093-1095. doi: 10.1093/bioinformatics/btw779.

Abstract

Protein families are often represented by profile hidden Markov models (pHMMs). Homology between two distant protein families can be determined by comparing their pHMMs. Here we explored the idea of building a phylogeny of protein families using the distance matrix of their pHMMs. We developed a new software package and web server (pHMM-tree) that accept four major types of input: (i) multiple pHMM files, (ii) multiple aligned protein sequence files, (iii) a mixture of pHMM and aligned sequence files and (iv) unaligned protein sequences in a single file. The output is a pHMM phylogeny of the protein families that delineates their relationships. We have applied pHMM-tree to build phylogenies for CAZyme (carbohydrate active enzyme) classes and Pfam clans, which attests to its usefulness in the phylogenetic representation of the evolutionary relationships among distant protein families.
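To make the idea of a phylogeny built from a distance matrix concrete, the following is a minimal sketch in Python. It assumes the pairwise pHMM distances have already been computed. The abstract does not state which tree-building algorithm pHMM-tree uses, so UPGMA is swapped in here purely as a simple illustration. The family names (famA, famB, famC), the distance values, and the function name upgma are all hypothetical.

```python
def upgma(names, dist):
    """Return a Newick string built by UPGMA from a symmetric distance matrix.

    names: list of leaf labels (hypothetical family names in this sketch).
    dist:  square matrix (list of lists) of pairwise distances.
    """
    # Each active cluster: id -> (newick_label, number_of_leaves, node_height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)

    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        label_a, size_a, height_a = clusters.pop(a)
        label_b, size_b, height_b = clusters.pop(b)

        # The new internal node sits at half the distance between the pair.
        height = dab / 2.0
        label = (f"({label_a}:{height - height_a:.3f},"
                 f"{label_b}:{height - height_b:.3f})")

        # Distance to every remaining cluster is the size-weighted average.
        new_d = {}
        for c in clusters:
            dac = d[tuple(sorted((a, c)))]
            dbc = d[tuple(sorted((b, c)))]
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)

        # Drop entries that mention the merged clusters, then add the new ones.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = (label, size_a + size_b, height)
        next_id += 1

    (label, _, _), = clusters.values()
    return label + ";"


# Made-up distances between three hypothetical protein families.
names = ["famA", "famB", "famC"]
dist = [
    [0.0, 2.0, 6.0],
    [2.0, 0.0, 6.0],
    [6.0, 6.0, 0.0],
]
print(upgma(names, dist))  # (famC:3.000,(famA:1.000,famB:1.000):2.000);
```

In this toy input, famA and famB are the closest pair and are joined first, and famC attaches at a deeper node. The resulting Newick string can be loaded into any standard tree viewer.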

Availability and implementation: This software is implemented in C/C++ and is available at http://cys.bios.niu.edu/pHMM-Tree/source/.

Contact: zhanghan@nankai.edu.cn or yyin@niu.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Markov Chains
  • Phylogeny*
  • Proteins / classification*
  • Sequence Alignment*
  • Sequence Analysis, Protein*
  • Software*

Substances

  • Proteins