Improving Inferences About Null Effects With Bayes Factors and Equivalence Tests

J Gerontol B Psychol Sci Soc Sci. 2020 Jan 1;75(1):45-57. doi: 10.1093/geronb/gby065.


Researchers often conclude an effect is absent when a null-hypothesis significance test yields a nonsignificant p value. However, it is neither logically nor statistically correct to conclude an effect is absent when a hypothesis test is not significant. We present two methods to evaluate the presence or absence of effects: equivalence testing (based on frequentist statistics) and Bayes factors (based on Bayesian statistics). In four examples from the gerontology literature, we illustrate different ways to specify alternative models that can be used to reject the presence of a meaningful or predicted effect in hypothesis tests. We provide detailed explanations of how to calculate, report, and interpret Bayes factors and equivalence tests. We also discuss how to design informative studies that can provide support for a null model or for the absence of a meaningful effect. The conceptual differences between Bayes factors and equivalence tests are discussed, and we also note when and why they might lead to similar or different inferences in practice. It is important that researchers can falsify predictions or quantify the support for predicted null effects. Bayes factors and equivalence tests provide useful statistical tools to improve inferences about null effects.
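
As a rough illustration of the two approaches named in the abstract, the sketch below applies both to the same summary data. It is not taken from the article. It assumes a large-sample normal (z) approximation in place of a t-based equivalence test, and a simple Normal(0, prior_sd^2) prior on the effect under the alternative model. The function names, the equivalence bounds of -0.3 and 0.3, the prior scale of 0.3, and the observed difference and standard error are all hypothetical values chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist


def tost_z(diff, se, low, high):
    """Two one-sided tests (TOST), large-sample normal approximation.

    Assumption: the estimate `diff` is approximately normal with standard
    error `se`. Returns the equivalence p value, which is the larger of
    the two one-sided p values.
    """
    z = NormalDist()
    # Test 1: reject "true effect <= low" when diff is far above low.
    p_lower = 1.0 - z.cdf((diff - low) / se)
    # Test 2: reject "true effect >= high" when diff is far below high.
    p_upper = z.cdf((diff - high) / se)
    return max(p_lower, p_upper)


def bf01_normal(diff, se, prior_sd):
    """Bayes factor for H0 (effect = 0) over H1 (effect ~ Normal(0, prior_sd^2)).

    Assumption: normal likelihood with known standard error `se`. Under H1
    the marginal distribution of `diff` is Normal(0, se^2 + prior_sd^2).
    Values above 1 favor the null model; values below 1 favor H1.
    """
    like_h0 = NormalDist(0.0, se).pdf(diff)
    like_h1 = NormalDist(0.0, sqrt(se**2 + prior_sd**2)).pdf(diff)
    return like_h0 / like_h1


# Hypothetical summary data: observed difference and its standard error.
observed_diff = 0.05
standard_error = 0.10

p_equivalence = tost_z(observed_diff, standard_error, low=-0.3, high=0.3)
bf01 = bf01_normal(observed_diff, standard_error, prior_sd=0.3)

print(f"TOST p value (equivalence): {p_equivalence:.4f}")
print(f"Bayes factor BF01:          {bf01:.2f}")
```

With these made-up numbers, the equivalence p value is about 0.006, so effects as large as the bounds would be rejected at the 5% level, while BF01 is about 2.8, which is only modest support for the null model. This is one way the two methods can point in the same direction yet differ in strength, and both results depend on the chosen bounds and prior scale.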

Keywords: Bayesian statistics; Falsification; Frequentist statistics; Hypothesis testing; TOST.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aging / physiology*
  • Bayes Theorem
  • Biomedical Research / methods*
  • Chronic Pain / physiopathology
  • Data Interpretation, Statistical*
  • Emotional Regulation / physiology
  • Geriatrics / methods*
  • Humans
  • Male
  • Memory / physiology
  • Models, Statistical*
  • Personality / physiology
  • Psychology / methods*
  • Research Design*