Breaking up is hard to do: the heartbreak of dichotomizing continuous data

Can J Psychiatry. 2002 Apr;47(3):262-6. doi: 10.1177/070674370204700307.


Researchers often take variables that are measured on a continuum and then break them into categories (for example, above or below some cut-point), either to place subjects into groups or as an outcome measure. In this article, we show that the rationales given for this practice are weak and that categorization results in lost information, reduced power of statistical tests, and increased probability of a Type II error. Dichotomizing a continuous variable is justified only when the distribution of that variable is highly skewed or its relation with another variable is nonlinear.
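
The loss of power described above can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the analysis used in the article: the correlation `rho`, sample size `n`, trial count, seed, and the function names are all hypothetical choices made for illustration. It draws bivariate normal data, tests the x-y association once with x left continuous and once with x split at its sample median, and compares how often each test rejects the null hypothesis.

```python
import math
import random
from statistics import NormalDist, median


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 xs this is the point-biserial r."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(xs, ys, alpha=0.05):
    """Two-sided test of r = 0 using the usual t statistic.

    Assumption: the critical value comes from the standard normal
    distribution, a large-sample approximation to the t distribution.
    """
    n = len(xs)
    r = pearson_r(xs, ys)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) > NormalDist().inv_cdf(1 - alpha / 2)


def estimate_power(rho=0.3, n=50, trials=2000, seed=1):
    """Estimate rejection rates for continuous vs. median-split x.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    hits_continuous = 0
    hits_dichotomized = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # y has correlation rho with x and unit variance.
        ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
        cut = median(xs)
        groups = [1.0 if x > cut else 0.0 for x in xs]  # above/below cut-point
        hits_continuous += is_significant(xs, ys)
        hits_dichotomized += is_significant(groups, ys)
    return hits_continuous / trials, hits_dichotomized / trials


power_continuous, power_dichotomized = estimate_power()
print(f"Estimated power, continuous x:   {power_continuous:.2f}")
print(f"Estimated power, median-split x: {power_dichotomized:.2f}")
```

Under these assumptions the median-split analysis should reject less often than the continuous one, which is the same thing as a higher Type II error rate for the same data. Changing `rho` or `n` changes the size of the gap; the exact figures depend on the seed and on the number of trials.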

MeSH terms

  • Bias
  • Clinical Trials as Topic / statistics & numerical data*
  • Data Interpretation, Statistical*
  • Humans
  • Outcome Assessment, Health Care / statistics & numerical data*
  • Psychiatric Status Rating Scales / statistics & numerical data*
  • Psychometrics