How to get statistically significant effects in any ERP experiment (and why you shouldn't)

Psychophysiology. 2017 Jan;54(1):146-157. doi: 10.1111/psyp.12639.


ERP experiments generate massive datasets, often containing thousands of values for each participant, even after averaging. The richness of these datasets can be very useful in testing sophisticated hypotheses, but this richness also creates many opportunities to obtain effects that are statistically significant but do not reflect true differences among groups or conditions (bogus effects). The purpose of this paper is to demonstrate how common and seemingly innocuous methods for quantifying and analyzing ERP effects can lead to very high rates of significant but bogus effects, with the likelihood of obtaining at least one such bogus effect exceeding 50% in many experiments. We focus on two specific problems: using the grand-averaged data to select the time windows and electrode sites for quantifying component amplitudes and latencies, and using one or more multifactor statistical analyses. Reanalyses of prior data and simulations of typical experimental designs are used to show how these problems can greatly increase the likelihood of significant but bogus results. Several strategies are described for avoiding these problems and for increasing the likelihood that significant effects actually reflect true differences among groups or conditions.
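
The two problems named in the abstract can be illustrated with a toy sketch. This is not the authors' simulation or reanalysis; the function names, sample sizes, and noise model below are hypothetical choices made only for illustration. The first part assumes that every effect in a full factorial ANOVA is tested independently at alpha = .05, which is a simplification. The second part generates pure-noise difference scores that are independent across time points (real ERP data are temporally correlated, so the exact rates would differ) and compares a measurement point fixed in advance with one chosen where the grand average is largest.

```python
import random
import statistics


def n_factorial_effects(n_factors):
    """Main effects plus interactions in a full factorial design."""
    return 2 ** n_factors - 1


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def one_sample_t(values):
    """t statistic for testing whether the mean of values differs from zero."""
    n = len(values)
    return statistics.fmean(values) / (statistics.stdev(values) / n ** 0.5)


def simulate_window_selection(n_experiments=400, n_subjects=20,
                              n_points=50, t_crit=2.093, seed=1):
    """Null simulation (no true effect anywhere).

    Returns the fraction of experiments with |t| > t_crit when the time
    point is (a) fixed in advance and (b) picked from the grand average.
    t_crit = 2.093 is the two-tailed .05 critical value for df = 19, so it
    matches the default n_subjects = 20 only.
    """
    rng = random.Random(seed)
    hits_fixed = 0
    hits_picked = 0
    for _ in range(n_experiments):
        # data[s][t]: condition difference for subject s at time point t
        data = [[rng.gauss(0.0, 1.0) for _ in range(n_points)]
                for _ in range(n_subjects)]
        columns = list(zip(*data))  # one tuple of subject values per time point
        grand_avg = [statistics.fmean(col) for col in columns]
        # Biased choice: measure where the grand-average difference is largest
        picked = max(range(n_points), key=lambda t: abs(grand_avg[t]))
        hits_fixed += int(abs(one_sample_t(columns[0])) > t_crit)
        hits_picked += int(abs(one_sample_t(columns[picked])) > t_crit)
    return hits_fixed / n_experiments, hits_picked / n_experiments


if __name__ == "__main__":
    for k in range(1, 5):
        m = n_factorial_effects(k)
        print(f"{k}-way ANOVA: {m} effects, "
              f"familywise rate ~ {familywise_error_rate(m):.3f}")
    fixed_rate, picked_rate = simulate_window_selection()
    print(f"a priori time point: {fixed_rate:.3f}")
    print(f"time point chosen from grand average: {picked_rate:.3f}")
```

Under the independence assumption, a four-way ANOVA tests 15 effects, and the chance of at least one bogus effect is about .537, consistent with the abstract's statement that this likelihood exceeds 50% in many experiments. In the toy simulation, the point chosen from the grand average should produce far more false positives than the point fixed in advance, because the same noise that determines the choice also drives the test.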

Keywords: Analysis/statistical methods; ERPs; Other.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Simulation
  • Data Interpretation, Statistical
  • Electroencephalography / methods*
  • Evoked Potentials*
  • Humans
  • Multivariate Analysis
  • Psychophysiology / methods*
  • Reproducibility of Results
  • Research Design
  • Signal Processing, Computer-Assisted