HHsvm: fast and accurate classification of profile-profile matches identified by HHsearch

Bioinformatics. 2009 Dec 1;25(23):3071-6. doi: 10.1093/bioinformatics/btp555. Epub 2009 Sep 22.

Abstract

Motivation: Recently developed profile-profile methods rival structural comparisons in their ability to detect homology between distantly related proteins. Despite this tremendous progress, many genuine relationships between protein families cannot be recognized because comparisons of their profiles yield statistically insignificant scores.

Results: Using known evolutionary relationships among protein superfamilies in the SCOP database, support vector machines were trained on four sets of discriminatory features derived from the output of HHsearch. In validation, automatic classification of all profile-profile matches was superior to fixed threshold-based annotation in both sensitivity and specificity. The effectiveness of this approach was demonstrated by annotating several domains of unknown function from the Pfam database.
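
To make the baseline in this comparison concrete, the following minimal sketch shows how sensitivity and specificity could be computed for a fixed threshold-based annotation of profile-profile matches. It is an illustration only, not the authors' code: the function names, the score values, the labels, and the three cutoffs are all hypothetical, and the scores merely stand in for a single match score of the kind a profile-profile search reports.

```python
from typing import List, Tuple


def classify_by_threshold(scores: List[float], threshold: float) -> List[bool]:
    """Call a match homologous when its score reaches the fixed cutoff."""
    return [score >= threshold for score in scores]


def sensitivity_specificity(labels: List[bool],
                            predicted: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for boolean labels and predictions."""
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, p in pairs if y and p)          # true homologs accepted
    fn = sum(1 for y, p in pairs if y and not p)      # true homologs rejected
    tn = sum(1 for y, p in pairs if not y and not p)  # non-homologs rejected
    fp = sum(1 for y, p in pairs if not y and p)      # non-homologs accepted
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical toy data: one score per match and whether the pair is truly
# related (in the paper, such labels come from SCOP superfamily membership).
scores = [98.0, 91.0, 62.0, 35.0, 20.0, 55.0, 30.0, 12.0]
labels = [True, True, True, True, False, False, False, False]

for threshold in (90.0, 50.0, 25.0):
    predicted = classify_by_threshold(scores, threshold)
    sens, spec = sensitivity_specificity(labels, predicted)
    print(f"threshold={threshold:5.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

On this toy data a strict cutoff rejects every false match but misses true relationships, while a lenient cutoff does the reverse. A classifier trained on several features at once, as described above, is a way to move beyond that single-score trade-off.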

Availability: Programs and scripts implementing the methods described in this manuscript are freely available from http://hhsvm.dlakiclab.org/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / methods*
  • Databases, Protein
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Sequence Homology, Amino Acid*
  • Software*

Substances

  • Proteins