In the analysis of current life science datasets, we often encounter scenarios in which the application of asymptotic theory to hypothesis testing can be problematic. Besides improved asymptotic results, permutation/simulation-based tests are a general approach to address this issue. However, these randomized tests can impose a massive computational burden, for example, when large numbers of statistical tests are computed and the specified significance level is very small. Stopping rules aim to assess significance with the smallest possible number of draws while controlling the probabilities of errors due to statistical uncertainty. In this communication, we derive a general stopping rule, QUICK-STOP, based on sequential testing theory, that is easy to implement, controls the error probabilities rigorously, and is nearly optimal in terms of expected draws. In a simulation study, we show that our approach outperforms current stopping approaches for general randomized tests by a factor of 10 and does not impose an additional computational burden. We illustrate our approach by applying our stopping rule to a single-variant analysis of a whole-genome sequencing study for lung function.
Keywords: association p-value; next-generation sequencing; permutation; randomized test; sequential testing.
© 2019 Wiley Periodicals, Inc.
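To make the idea of a stopping rule for randomized tests concrete, the following is a minimal sketch of a generic Wald sequential probability ratio test applied to permutation draws. It is not the QUICK-STOP rule itself, whose exact boundaries are not given in this abstract. The function name `sprt_permutation_test`, the callback `draw_exceeds`, the indifference bounds `p0` and `p1` placed around the significance level, and the error rates `err_a` and `err_b` are all assumptions introduced for illustration.

```python
import math
import random


def sprt_permutation_test(draw_exceeds, p0, p1,
                          err_a=0.01, err_b=0.01, max_draws=1_000_000):
    """Generic Wald SPRT on Bernoulli exceedance indicators (illustrative only).

    draw_exceeds: hypothetical callback; performs one permutation/simulation
        draw and returns True if the resampled statistic is at least as
        extreme as the observed one.
    p0, p1: assumed bounds with p0 < significance level < p1.
        H0: p-value = p0 (significant) vs. H1: p-value = p1 (not significant).
    err_a, err_b: targeted probabilities of wrongly rejecting H0 / H1.
    Returns (decision, number_of_draws_used).
    """
    if not 0.0 < p0 < p1 < 1.0:
        raise ValueError("require 0 < p0 < p1 < 1")

    # Wald's approximate decision boundaries on the log-likelihood ratio.
    upper = math.log((1.0 - err_b) / err_a)   # crossing favors H1
    lower = math.log(err_b / (1.0 - err_a))   # crossing favors H0

    # Log-likelihood-ratio increments for an exceedance and a non-exceedance.
    step_hit = math.log(p1 / p0)
    step_miss = math.log((1.0 - p1) / (1.0 - p0))

    llr = 0.0
    for n in range(1, max_draws + 1):
        llr += step_hit if draw_exceeds() else step_miss
        if llr >= upper:
            return "not significant", n
        if llr <= lower:
            return "significant", n
    # The draw budget was exhausted before either boundary was crossed.
    return "undecided", max_draws


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated test whose true p-value is 0.5, far above the bounds.
    decision, draws = sprt_permutation_test(
        lambda: rng.random() < 0.5, p0=0.01, p1=0.05)
    print(decision, draws)
```

In this sketch, a test whose true p-value lies far from the significance level stops after only a few draws, whereas a p-value between `p0` and `p1` can require many draws, which is why a cap `max_draws` is included.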