Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from alternative prediction models

Biometrics. 2009 Mar;65(1):188-97. doi: 10.1111/j.1541-0420.2008.01062.x. Epub 2008 May 28.


The Wilcoxon-Mann-Whitney (WMW) U test is commonly used for nonparametric two-group comparisons when the normality of the underlying distribution is questionable. Some previous work has addressed power estimation for this procedure (Lehmann, 1998, Nonparametrics). In this article, we present an approach for estimating the type II error that is applicable to any continuous distribution, and we extend the approach to grouped continuous data, allowing for ties. We apply these results to obtain standard errors of the area under the receiver operating characteristic curve (AUROC) for risk-prediction rules under the alternative hypothesis (H1), and to compare AUROC between competing risk-prediction rules applied to the same data set. These results are based on SAS-callable functions that evaluate the bivariate normal integral and are thus easily implemented with standard software.
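The link between the WMW U statistic and the C statistic, and the idea of estimating power, can be illustrated with a short sketch. This is not the article's analytic method, which relies on the bivariate normal integral. It is a plain Monte Carlo substitute under assumed conditions: normally distributed scores with unit variance, a hypothetical location shift between groups, and the usual tie-corrected normal approximation for the test. The function names (`c_statistic`, `wmw_p_value`, `simulated_power`) and all parameter values are illustrative assumptions.

```python
import random
from statistics import NormalDist


def c_statistic(cases, controls):
    """Estimate the AUROC as U / (m * n); a tie between groups counts as 1/2."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def wmw_p_value(cases, controls):
    """Two-sided p-value from the normal approximation with tie correction."""
    m, n = len(cases), len(controls)
    total = m + n
    u = c_statistic(cases, controls) * m * n
    # Tie correction: sum of (t^3 - t) over groups of tied pooled values.
    counts = {}
    for value in list(cases) + list(controls):
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = m * n / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var_u <= 0:
        return 1.0  # all observations tied: no evidence against the null
    z = (u - m * n / 2.0) / var_u ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulated_power(m, n, shift, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo power estimate (assumption: N(shift, 1) vs N(0, 1) scores)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        cases = [rng.gauss(shift, 1.0) for _ in range(m)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if wmw_p_value(cases, controls) < alpha:
            rejections += 1
    return rejections / reps
```

Calling `simulated_power(30, 30, 1.0)` estimates the power for 30 subjects per group under the assumed shift, and `1 - simulated_power(...)` gives the corresponding type II error estimate. Simulation is used here only because it needs no distribution-specific integrals; an analytic approach such as the one described in the abstract avoids the Monte Carlo error.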

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Area Under Curve
  • Biometry / methods*
  • Data Interpretation, Statistical
  • Humans
  • Models, Theoretical
  • ROC Curve
  • Sample Size
  • Statistics, Nonparametric*