Statistical grand rounds: a review of analysis and sample size calculation considerations for Wilcoxon tests

Anesth Analg. 2013 Sep;117(3):699-710. doi: 10.1213/ANE.0b013e31827f53d7. Epub 2013 Mar 1.


When a study uses an ordinal outcome measure with unknown distances between the anchors and a small range, such as 4 or 7, the Wilcoxon rank sum test or the Wilcoxon signed rank test may be the most appropriate analysis. However, because nonparametric methods are at best indirect functions of standard measures of location such as means or medians, choosing the most appropriate summary measure can be difficult. The issues underlying the use of these tests are discussed. The Wilcoxon-Mann-Whitney odds directly reflects the quantity that the rank sum procedure actually tests, and thus it can be a superior summary measure. Unlike means and medians, its value has a one-to-one correspondence with the Wilcoxon rank sum test result. The companion article appearing in this issue of Anesthesia & Analgesia ("Aromatherapy as Treatment for Postoperative Nausea: A Randomized Trial") illustrates these issues and provides an example in which the medians imply no difference between 2 groups even though the groups are, in fact, quite different. The trial also provides an example of a single sample that has a median of zero, yet much of the nonzero data is substantially shifted and the Wilcoxon signed rank test is quite significant. These examples highlight the potential discordance between medians and Wilcoxon test results. Beyond the choice of a summary measure, there are considerations for the computation of sample size and power, confidence intervals, and multiple comparison adjustment. In addition, despite the greater robustness of the Wilcoxon procedures relative to parametric tests, some circumstances in which the Wilcoxon tests may perform poorly are noted, along with alternative versions of the procedures that correct for such limitations.
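The discordance between medians and the rank sum result can be seen in a minimal sketch. It assumes the usual definition of the Wilcoxon-Mann-Whitney odds as p / (1 - p), where p = P(X > Y) + 0.5 P(X = Y) is estimated over all pairs of observations. The function names and the two samples of ordinal scores are hypothetical and are not data from the cited trial.

```python
from statistics import median


def wmw_probability(x, y):
    """Estimate p = P(X > Y) + 0.5 * P(X = Y) from all len(x) * len(y) pairs."""
    wins = sum(1 for a in x for b in y if a > b)
    ties = sum(1 for a in x for b in y if a == b)
    return (wins + 0.5 * ties) / (len(x) * len(y))


def wmw_odds(x, y):
    """Wilcoxon-Mann-Whitney odds: p / (1 - p). Undefined if p equals 1."""
    p = wmw_probability(x, y)
    return p / (1.0 - p)


# Hypothetical ordinal scores on a 0-3 scale (illustration only).
group_a = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
group_b = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print("median A:", median(group_a))
print("median B:", median(group_b))

p = wmw_probability(group_a, group_b)
# The Mann-Whitney U statistic is a rescaling of p: U = n_a * n_b * p.
u_statistic = p * len(group_a) * len(group_b)
print("p:", p)
print("U:", u_statistic)
print("WMW odds:", wmw_odds(group_a, group_b))
```

Both hypothetical groups have a median of zero, yet the estimated odds are about 1.9 in favor of a higher score in group A. Because U is a fixed multiple of p for given sample sizes, the odds move in step with the rank sum statistic, which is the one-to-one correspondence the abstract describes.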

Publication types

  • Review

MeSH terms

  • Algorithms
  • Analgesics, Opioid / adverse effects
  • Anesthesiology
  • Confidence Intervals
  • Data Interpretation, Statistical*
  • Humans
  • Models, Statistical
  • Postoperative Nausea and Vomiting / epidemiology
  • Randomized Controlled Trials as Topic
  • Research Design
  • Sample Size*
  • Teaching Rounds


Substances

  • Analgesics, Opioid