The Bayes factor, HDI-ROPE, and frequentist equivalence tests can all be reverse engineered-Almost exactly-From one another: Reply to Linde et al. (2021)

Psychol Methods. 2024 Mar 21. doi: 10.1037/met0000507. Online ahead of print.

Abstract

Following an extensive simulation study comparing the operating characteristics of three different procedures used for establishing equivalence (the frequentist "TOST," the Bayesian "HDI-ROPE," and the Bayes factor interval null procedure), Linde et al. (2021) conclude with the recommendation that "researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence" (p. 1). We redo the simulation study of Linde et al. (2021) in its entirety but with the different procedures calibrated to have the same predetermined maximum Type I error rate. Our results suggest that, when calibrated in this way, the Bayes factor, HDI-ROPE, and frequentist equivalence tests all have similar-almost exactly-Type II error rates. In general any advocating for frequentist testing as better or worse than Bayesian testing in terms of empirical findings seems dubious at best. If one decides on which underlying principle to subscribe to in tackling a given problem, then the method follows naturally. Bearing in mind that each procedure can be reverse-engineered from the others (at least approximately), trying to use empirical performance to argue for 1 approach over another seems like tilting at windmills. (PsycInfo Database Record (c) 2024 APA, all rights reserved).