Secondary structure prediction with support vector machines

J J Ward; L J McGuffin; B F Buxton; D T Jones

doi:10.1093/bioinformatics/btg223

Bioinformatics. 2003 Sep 1;19(13):1650-5. doi: 10.1093/bioinformatics/btg223.

Authors

J J Ward¹, L J McGuffin, B F Buxton, D T Jones

Affiliation

¹ Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK.

PMID: 12967961
DOI: 10.1093/bioinformatics/btg223

Abstract

Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem.

Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure.
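The abstract does not say which combination schemes were used. As an illustration only, the sketch below shows one common way to turn binary classifiers into a three-state predictor: a one-versus-rest rule that assigns each residue the class whose binary classifier returns the largest decision value. The function name, the class labels H/E/C (helix, strand, coil) and the decision values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Assumed three-state alphabet: H = helix, E = strand, C = coil.
CLASSES = ("H", "E", "C")


def combine_one_vs_rest(scores: List[Dict[str, float]]) -> str:
    """Combine per-residue binary decision values into a three-state string.

    Each element of `scores` holds, for one residue, the decision value of
    the (hypothetical) H/rest, E/rest and C/rest binary classifiers.
    The class with the largest value wins.
    """
    return "".join(max(CLASSES, key=lambda c: s[c]) for s in scores)


# Made-up decision values for a four-residue fragment (illustration only).
example_scores = [
    {"H": 1.2, "E": -0.8, "C": -0.3},
    {"H": 0.4, "E": -1.1, "C": 0.1},
    {"H": -0.9, "E": 0.7, "C": 0.2},
    {"H": -1.0, "E": -0.5, "C": 0.6},
]

predicted = combine_one_vs_rest(example_scores)
print(predicted)  # HHEC
```

Other schemes, such as pairwise (one-versus-one) classifiers combined by voting, fit the same interface; only the rule inside the function changes.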

Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26% with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods.
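For readers unfamiliar with the measures above, the following sketch shows how a per-protein three-state accuracy can be computed and averaged, and how three predictions might be merged by majority vote. The abstract only says "a simple consensus", so the majority rule and the tie-break (fall back to the first method) are assumptions, as are all function names. The sequences are invented toy data and do not reproduce any result from the paper. The Sov score is not covered here.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def mean_q3(pairs: Sequence[Tuple[str, str]]) -> float:
    """Average Q3 per protein: each protein counts equally, whatever its length."""
    return sum(q3(p, o) for p, o in pairs) / len(pairs)


def majority_consensus(predictions: List[str], fallback: int = 0) -> str:
    """Per-residue majority vote; without a strict majority, use one method.

    `fallback` is the index of the method trusted when no state has a strict
    majority (an assumption made for this sketch).
    """
    out = []
    for column in zip(*predictions):
        state, count = Counter(column).most_common(1)[0]
        out.append(state if count > len(column) / 2 else column[fallback])
    return "".join(out)


# Invented six-residue example with three hypothetical methods.
observed = "HHHEEC"
method_a = "HHCEEC"
method_b = "HHHECC"
method_c = "CHHEEC"

consensus = majority_consensus([method_a, method_b, method_c])
print(consensus)                          # HHHEEC
print(round(q3(method_a, observed), 2))   # 83.33
print(q3(consensus, observed))            # 100.0
```

In this toy case each method errs at a different position, so the vote corrects every error; real methods make correlated errors and the gain is smaller.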

Publication types

Comparative Study
Evaluation Study
Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Algorithms*
Artificial Intelligence*
Benchmarking
Cluster Analysis*
Computing Methodologies
Models, Statistical*
Pattern Recognition, Automated
Protein Structure, Secondary
Proteins / chemistry*
Proteins / classification
Reproducibility of Results
Sensitivity and Specificity
Sequence Alignment / methods*
Sequence Analysis, Protein / methods*

Substances

Proteins