Power to the People: Power, Negative Results and Sample Size

J Am Assoc Lab Anim Sci. 2020 Jan 1;59(1):9-16. doi: 10.30802/AALAS-JAALAS-19-000042. Epub 2019 Dec 18.

Abstract

The practical application of statistical power is becoming an increasingly important part of experimental design, data analysis, and reporting. Power is essential to estimating sample size as part of planning studies and obtaining ethical approval for them. Furthermore, power is essential for publishing and interpreting negative results. In this manuscript, we review what power is, how it can be calculated, and reporting recommendations if a null result is found. Power can be thought of as reflecting the signal-to-noise ratio of an experiment. The conventional wisdom that statistical power is driven by sample size (which increases the signal in the data), while true, is a misleading oversimplification. Relatively little discussion covers the use of experimental designs that control and reduce noise. Even small improvements in experimental design can achieve high power at much lower sample sizes than (for instance) a simple t test. Failure to report the experimental design or the proposed statistical test on animal care and use protocols creates a dilemma for IACUCs, because it is unknown whether sample size has been correctly calculated. Traditional power calculations, which are primarily provided for animal number justifications, are available only for simple, yet low-powered, experimental designs, such as t tests. Thus, in most controlled experimental studies, the only analyses for which power can be calculated are those that inherently have low statistical power; these analyses should not be used because they require more animals than necessary. We provide suggestions for more powerful experimental designs (such as randomized block and factorial designs) that increase power, and we describe methods to easily calculate sample size for these designs that are suitable for IACUC number justifications. Finally, we provide recommendations for reporting negative results, so that readers and reviewers can determine whether an experiment had sufficient power.
The use of more sophisticated designs in animal experiments will inevitably improve power and reproducibility while reducing animal use.
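The signal-to-noise view of power can be illustrated with a minimal sketch. It is not the calculation method described in the article; it assumes the textbook normal approximation for a two-sided, two-sample comparison of means, and the function name n_per_group and the example values for the difference (delta) and standard deviation (sigma) are hypothetical. The approximation slightly underestimates the sample size an exact t-based calculation would give when groups are small.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-sample comparison.

    delta: smallest difference in means worth detecting (the signal)
    sigma: within-group standard deviation (the noise)
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / (delta / sigma))^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    effect = delta / sigma               # standardized signal-to-noise ratio
    return math.ceil(2 * ((z_alpha + z_beta) / effect) ** 2)


# Hypothetical values: same signal, noise halved (for example, by a design
# that removes a known source of variation from the error term).
print(n_per_group(delta=10, sigma=10))  # 16 per group
print(n_per_group(delta=10, sigma=5))   # 4 per group
```

Under these assumptions, halving the noise cuts the required group size by about a factor of four, which is the sense in which design, not only sample size, drives power. The sketch ignores the degrees of freedom that a blocking factor consumes, so it should be read as an illustration rather than a number justification.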

MeSH terms

  • Animal Experimentation*
  • Animals
  • Humans
  • Negative Results
  • Reproducibility of Results
  • Research Design*
  • Sample Size*