HH-suite3 for fast remote homology detection and deep protein annotation
- PMID: 31521110
- PMCID: PMC6744700
- DOI: 10.1186/s12859-019-3019-7
Abstract
Background: HH-suite is a widely used open-source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile hidden Markov models (HMMs), which represent multiple sequence alignments of homologous proteins.
Results: We developed a single-instruction multiple-data (SIMD) vectorized implementation of the Viterbi algorithm for profile HMM alignment and introduced various other speed-ups. These accelerated the search methods HHsearch by a factor of 4 and HHblits by a factor of 2 over the previous version 2.0.16. HHblits3 is ∼10× faster than PSI-BLAST and ∼20× faster than HMMER3. Jobs that run HHsearch and HHblits searches with many query profile HMMs can be parallelized over cores and over cluster servers using OpenMP and the message passing interface (MPI). The free, open-source, GPLv3-licensed software is available at https://github.com/soedinglab/hh-suite.
Conclusion: The added functionalities and increased speed of HHsearch and HHblits should facilitate their use in large-scale protein structure and function prediction, e.g. in metagenomics and genomics projects.
Keywords: Algorithm; Functional annotation; Homology detection; Profile HMM; Protein alignment; SIMD; Sequence search.
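As an illustration of the kind of dynamic programming that a Viterbi alignment of two profiles involves, the following is a minimal scalar sketch in Python. It is not the HH-suite implementation: it assumes a single match state, a fixed linear gap penalty, and a log-sum-of-odds column score, and it makes no attempt to reproduce the SIMD vectorization. The function names, the `gap` parameter, and the two-letter toy alphabet are hypothetical and chosen only for the example.

```python
import math


def column_score(q_col, t_col, background):
    """Log-sum-of-odds similarity between two profile columns.

    Assumes all probabilities are positive so the logarithm is defined.
    """
    odds = sum(q * t / f for q, t, f in zip(q_col, t_col, background))
    return math.log2(odds)


def viterbi_local_score(query, target, background, gap=1.0):
    """Best local alignment score of two profiles (simplified sketch).

    query and target are lists of columns; each column is a list of
    residue probabilities. Uses one match state and a fixed gap
    penalty, which is an assumption of this example.
    """
    prev = [0.0] * (len(target) + 1)
    best = 0.0
    for q_col in query:
        curr = [0.0] * (len(target) + 1)
        for j, t_col in enumerate(target, start=1):
            curr[j] = max(
                0.0,  # start a new local alignment here
                prev[j - 1] + column_score(q_col, t_col, background),
                prev[j] - gap,  # gap in the target
                curr[j - 1] - gap,  # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Toy two-letter alphabet with a uniform background distribution.
background = [0.5, 0.5]
col_a = [0.9, 0.1]
col_b = [0.1, 0.9]

same = viterbi_local_score([col_a, col_b, col_a], [col_a, col_b, col_a], background)
different = viterbi_local_score([col_a, col_a], [col_b, col_b], background)
print(same, different)
```

Identical profiles score along the main diagonal, while profiles with opposing column preferences never rise above the local-alignment floor of zero. Only the running maximum is kept here; recovering the alignment itself would additionally require a traceback.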
Conflict of interest statement
The authors declare that they have no competing interests.
