Simultaneous Critical Values for t-Tests in Very High Dimensions

Bernoulli (Andover). 2011 Feb;17(1):347-394. doi: 10.3150/10-BEJ272.

Abstract

This article considers the problem of multiple hypothesis testing using t-tests. The observed data are assumed to be independently generated conditional on an underlying and unknown two-state hidden model. We propose an asymptotically valid data-driven procedure to find critical values for rejection regions that control the k-familywise error rate (k-FWER), the false discovery rate (FDR) and the tail probability of the false discovery proportion (FDTP), using one-sample and two-sample t-statistics. Owing to the moderate deviation properties of t-statistics, we require only a finite fourth moment plus some very general conditions on the mean and variance of the population. A new consistent estimator for the proportion of alternative hypotheses is developed. Simulation studies support our theoretical results and demonstrate that the power of a multiple testing procedure can be substantially improved by using critical values directly rather than the conventional p-value approach. Our method is applied to microarray data from a leukemia cancer study that involves testing a large number of hypotheses simultaneously.
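To make the idea of thresholding t-statistics directly more concrete, the following is a minimal Python sketch. It is not the data-driven procedure proposed in the article. It uses a simpler, well-known stand-in: the generalized Bonferroni bound for k-FWER (per-test level k*alpha/m) combined with a normal approximation to the t-statistic. The function names, the simulated data, and all parameter values (m, n, n_alt, shift, k, alpha) are assumptions chosen purely for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def one_sample_t(sample, mu0=0.0):
    """Student's one-sample t-statistic for H0: population mean == mu0."""
    n = len(sample)
    return math.sqrt(n) * (mean(sample) - mu0) / stdev(sample)


def kfwer_critical_value(m, k=1, alpha=0.05):
    """Two-sided critical value for m tests at per-test level k*alpha/m.

    Assumption: this is the generalized Bonferroni bound with a normal
    approximation to the t-statistic, not the article's estimator.
    """
    return NormalDist().inv_cdf(1.0 - k * alpha / (2.0 * m))


def simulate_t_stats(m=2000, n=30, n_alt=100, shift=1.0, seed=0):
    """Hypothetical two-state setup: the first n_alt means equal `shift`,
    the remaining m - n_alt means are zero (true nulls)."""
    rng = random.Random(seed)
    stats = []
    for i in range(m):
        mu = shift if i < n_alt else 0.0
        stats.append(one_sample_t([rng.gauss(mu, 1.0) for _ in range(n)]))
    return stats


t_stats = simulate_t_stats()
crit = kfwer_critical_value(m=len(t_stats), k=5, alpha=0.05)

# Reject every hypothesis whose |t| exceeds the single shared critical value.
rejected = [i for i, t in enumerate(t_stats) if abs(t) > crit]
print(f"critical value: {crit:.3f}, rejections: {len(rejected)}")
```

Allowing k to grow lowers the shared critical value, which is one way a k-FWER criterion trades a few tolerated false rejections for more power. The normal quantile used here is only a rough substitute when the sample size n is small relative to the number of tests m.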