Outlier removal, sum scores, and the inflation of the Type I error rate in independent samples t tests: the power of alternatives and recommendations

Psychol Methods. 2014 Sep;19(3):409-27. doi: 10.1037/met0000014. Epub 2014 Apr 28.


In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal sum scores. After reviewing common practice, we present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. We found Type I error rates of above 20% after removing outliers with a threshold value of Z = 2 in a short and difficult test. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold values of Z after having seen the effects thereof on outcomes. We recommend the use of nonparametric Mann-Whitney-Wilcoxon tests or robust Yuen-Welch tests without removing outliers. These alternatives to independent samples t tests are found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computer Simulation*
  • Humans
  • Models, Statistical*
  • Statistics as Topic*
  • Statistics, Nonparametric*