Weighted Area Under the Receiver Operating Characteristic Curve and Its Application to Gene Selection

J R Stat Soc Ser C Appl Stat. 2010 Aug;59(4):673-692. doi: 10.1111/j.1467-9876.2010.00713.x.

Abstract

The partial area under the ROC curve (PAUC) was proposed for gene selection by Pepe et al. (2003) and has since been applied in real data analysis. Empirical studies have shown that this measure has several key weaknesses, such as an inability to reflect nonuniform weighting of different decision thresholds, which results in large numbers of ties. In this paper we propose the weighted area under the ROC curve (WAUC) to address the problems associated with PAUC. The proposed measure offers greater flexibility in describing the discrimination accuracy of genes. We introduce nonparametric and parametric estimation methods, which cover PAUC as a special case, along with theoretical properties of the estimators. We also provide a simple variance formula, which yields a novel variance estimator for nonparametric estimation of PAUC, a problem that has proven challenging in previous work. The proposed methods permit sensitivity analyses, in which the impact of differing weight functions on gene rankings can be assessed and results can be synthesized across weights. Simulations and a re-analysis of two well-known microarray datasets illustrate the practical utility of WAUC.
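As a rough illustration of the idea (not the estimator from the paper, whose exact form is not given in this abstract), the sketch below assumes that WAUC can be written as the integral of ROC(u) times a weight function w(u) over false positive rates u in [0, 1], and plugs in the empirical ROC step function. The function names `empirical_wauc`, `uniform_cdf`, and `pauc_cdf` are hypothetical, and the weight is passed as its cumulative integral W(u) so that no numerical integration is needed. Under this assumption, a uniform weight gives the ordinary AUC and an indicator weight on [0, u0] gives an unnormalized PAUC.

```python
from bisect import bisect_right


def empirical_wauc(controls, cases, weight_cdf):
    """Plug-in estimate of the integral of ROC(u) * w(u) over u in [0, 1].

    Assumption (not taken from the paper): the empirical ROC is the step
    function whose height on the false-positive-rate interval
    ((k-1)/m, k/m) is the fraction of cases strictly greater than the
    k-th largest control. `weight_cdf(u)` must return W(u), the integral
    of the weight w from 0 to u. Ties between a case and a control are
    counted as "not greater".
    """
    m, n = len(controls), len(cases)
    sorted_cases = sorted(cases)
    total = 0.0
    for k, c in enumerate(sorted(controls, reverse=True), start=1):
        # True positive rate at threshold c: share of cases above c.
        tpr = (n - bisect_right(sorted_cases, c)) / n
        # Weight mass assigned to the k-th false-positive-rate interval.
        mass = weight_cdf(k / m) - weight_cdf((k - 1) / m)
        total += tpr * mass
    return total


def uniform_cdf(u):
    """W(u) for w(u) = 1 on [0, 1]; recovers the ordinary AUC."""
    return u


def pauc_cdf(u0):
    """W(u) for w(u) = 1 on [0, u0] and 0 elsewhere (unnormalized PAUC)."""
    return lambda u: min(u, u0)


# Toy expression values for one gene (made-up numbers, for illustration).
controls = [1.0, 2.0, 3.0, 4.0]
cases = [2.5, 3.5, 5.0, 6.0]

auc = empirical_wauc(controls, cases, uniform_cdf)
pauc = empirical_wauc(controls, cases, pauc_cdf(0.5))
print(auc, pauc)
```

Evaluating the same gene under several choices of `weight_cdf` and comparing the resulting rankings is one simple way to mimic the kind of sensitivity analysis the abstract describes.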

Keywords: Empirical distribution; Gene selection; Location-scale model; Partial area under the curve; Random threshold; Weighted area under the curve.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't