Decision threshold adjustment in class prediction

SAR QSAR Environ Res. 2006 Jun;17(3):337-52. doi: 10.1080/10659360600787700.

Abstract

Standard classification algorithms are generally designed to maximize the number of correct predictions (concordance). This criterion may not be appropriate in every application: some emphasize high sensitivity (e.g., clinical diagnostic tests), while others emphasize high specificity (e.g., epidemiological screening studies). This paper considers the effects of the decision threshold on sensitivity, specificity, and concordance for four classification methods: logistic regression, classification tree, Fisher's linear discriminant analysis, and a weighted k-nearest neighbor. We investigated the use of decision threshold adjustment to improve either the sensitivity or the specificity of a classifier under specific conditions. A Monte Carlo simulation showed that as the decision threshold increases, the sensitivity decreases and the specificity increases, while the concordance values in an interval around the maximum concordance remain similar. For specified sensitivity and specificity levels, an optimal decision threshold that meets the specified requirement might therefore be found in an interval around the maximum concordance. Three example data sets were analyzed for illustration.
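The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the scores and labels are made-up toy values, and the function names `rates` and `pick_threshold` are hypothetical. The sketch assumes a classifier that outputs a score per sample and predicts the positive class when the score is at or above the threshold. It then picks, from a grid, the threshold with the highest concordance among those that meet a required sensitivity.

```python
def rates(scores, labels, threshold):
    """Return (sensitivity, specificity, concordance) at one threshold.

    A sample is predicted positive when its score >= threshold.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    concordance = (tp + tn) / len(pairs)  # fraction of correct predictions
    return sensitivity, specificity, concordance


def pick_threshold(scores, labels, min_sensitivity, thresholds):
    """Among thresholds meeting min_sensitivity, return the one with the
    highest concordance as (threshold, sens, spec, conc), or None."""
    best = None
    for t in thresholds:
        sens, spec, conc = rates(scores, labels, t)
        if sens >= min_sensitivity and (best is None or conc > best[3]):
            best = (t, sens, spec, conc)
    return best


# Toy data (an assumption for illustration, not from the paper).
scores = [0.05, 0.20, 0.32, 0.35, 0.45, 0.55, 0.62, 0.72, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
grid = [i / 10 for i in range(1, 10)]  # thresholds 0.1, 0.2, ..., 0.9

for t in grid:
    sens, spec, conc = rates(scores, labels, t)
    print(f"threshold={t:.1f}  sens={sens:.2f}  spec={spec:.2f}  conc={conc:.2f}")

unconstrained = pick_threshold(scores, labels, 0.0, grid)
high_sensitivity = pick_threshold(scores, labels, 0.9, grid)
print("max concordance:", unconstrained)
print("sensitivity >= 0.9:", high_sensitivity)
```

On this toy data, requiring a sensitivity of at least 0.9 moves the chosen threshold below the concordance-maximizing one and costs a small amount of concordance. A required specificity could be handled the same way by changing the constraint in `pick_threshold`.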

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Animals
  • Artificial Intelligence
  • Classification*
  • Colonic Neoplasms / genetics
  • Computer Simulation
  • Databases, Factual
  • Decision Support Techniques*
  • Decision Trees
  • Discriminant Analysis
  • Gene Expression Profiling
  • Humans
  • Liver Neoplasms / chemically induced
  • Logistic Models
  • Monte Carlo Method
  • Receptors, Estrogen / metabolism
  • Structure-Activity Relationship

Substances

  • Receptors, Estrogen