Assessing the accuracy of prediction algorithms for classification: an overview

Bioinformatics. 2000 May;16(5):412-24. doi: 10.1093/bioinformatics/16.5.412.


We provide a unified overview of methods that are currently widely used to assess the accuracy of prediction algorithms, ranging from raw percentages, quadratic error measures and other distances, and correlation coefficients to information-theoretic measures such as relative entropy and mutual information. We briefly discuss the advantages and disadvantages of each approach. For classification tasks, we derive new learning algorithms for the design of prediction systems by directly optimising the correlation coefficient. We observe and prove several results relating the sensitivity and specificity of optimal systems. While the principles are general, we illustrate their applicability on specific problems such as protein secondary structure and signal peptide prediction.
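
As an illustration only, several of the measures named above can be computed for a two-class problem from the four confusion-table counts. The sketch below is not taken from the paper: the function names, the argument names (tp, tn, fp, fn) and the example counts are hypothetical. It assumes the widely used definition of specificity, TN / (TN + FP); some bioinformatics work instead reports TP / (TP + FP) under that name, so the definition in use should be checked before comparing numbers. The correlation coefficient is computed in its usual binary (Matthews) form, and mutual information is measured in bits.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, specificity and the correlation
    coefficient (Matthews form) for a two-class confusion table.

    Assumption: specificity is TN / (TN + FP); other definitions exist.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Ratios are reported as 0.0 when their denominator is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention chosen here: correlation is 0.0 if any marginal is empty.
    correlation = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "correlation": correlation,
    }


def mutual_information_bits(tp, tn, fp, fn):
    """Mutual information (in bits) between true and predicted classes."""
    total = tp + tn + fp + fn
    # Joint counts indexed as (true class, predicted class).
    joint = {(1, 1): tp, (1, 0): fn, (0, 1): fp, (0, 0): tn}
    true_marginal = {1: tp + fn, 0: tn + fp}
    pred_marginal = {1: tp + fp, 0: tn + fn}
    info = 0.0
    for (t, p), count in joint.items():
        if count == 0:
            continue  # a zero cell contributes nothing to the sum
        p_joint = count / total
        p_indep = (true_marginal[t] / total) * (pred_marginal[p] / total)
        info += p_joint * math.log2(p_joint / p_indep)
    return info


# Hypothetical counts, chosen only to exercise the functions.
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
example_info = mutual_information_bits(tp=40, tn=45, fp=5, fn=10)
```

Raw percentage accuracy can look high on unbalanced data even for a trivial predictor, whereas the correlation coefficient and mutual information use all four counts, which is one reason such measures are often reported side by side.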

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Algorithms*
  • Classification / methods*
  • Computational Biology
  • Learning
  • Models, Statistical
  • Neural Networks, Computer