Dichotomizing continuous predictors in multiple regression: a bad idea

Stat Med. 2006 Jan 15;25(1):127-41. doi: 10.1002/sim.2331.


In medical research, continuous variables are often converted into categorical variables by grouping values into two or more categories. We consider in detail issues pertaining to creating just two groups, a common approach in clinical research. We argue that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power and residual confounding. In addition, the use of a data-derived 'optimal' cutpoint leads to serious bias. We illustrate the impact of dichotomization of continuous predictor variables using as a detailed case study a randomized trial in primary biliary cirrhosis. Dichotomization of continuous data is unnecessary for statistical analysis and in particular should not be applied to explanatory variables in regression models.
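The loss of power described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis: it assumes a single normally distributed predictor, a linear effect, a median split as the cutpoint, and a simple test of the correlation. The function names, sample size, effect size, and critical value are all hypothetical choices made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def significant(xs, ys, t_crit):
    """Two-sided t-test of zero correlation (equivalent to the slope test
    in simple linear regression, and to a two-sample t-test when xs is binary)."""
    n = len(xs)
    r = pearson_r(xs, ys)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) > t_crit


def simulate_power(n=100, beta=0.25, reps=2000, seed=1, t_crit=1.984):
    """Estimate power with the predictor kept continuous vs. split at the median.

    Assumptions (illustrative only): x ~ N(0, 1), y = beta * x + N(0, 1) noise,
    and t_crit is approximately the 5% two-sided critical value for 98 d.f.
    """
    rng = random.Random(seed)
    hits_cont = 0
    hits_dich = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        median = sorted(x)[n // 2]  # upper middle value; gives an even split when n is even
        x_bin = [1.0 if xi >= median else 0.0 for xi in x]
        hits_cont += significant(x, y, t_crit)
        hits_dich += significant(x_bin, y, t_crit)
    return hits_cont / reps, hits_dich / reps


if __name__ == "__main__":
    power_cont, power_dich = simulate_power()
    print(f"power, continuous predictor:   {power_cont:.2f}")
    print(f"power, dichotomized predictor: {power_dich:.2f}")
```

Under these assumptions the dichotomized analysis should reject the null hypothesis less often than the continuous one. The sketch covers only the power argument in a single-predictor setting; it does not address residual confounding in multiple regression or the bias from a data-derived 'optimal' cutpoint.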

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Age Factors
  • Albumins / analysis
  • Antimetabolites / pharmacology
  • Azathioprine / pharmacology
  • Bilirubin / analysis
  • Cholestasis / drug therapy
  • Data Interpretation, Statistical*
  • Humans
  • Liver Cirrhosis, Biliary / drug therapy
  • Randomized Controlled Trials as Topic / methods*
  • Regression Analysis*


Substances

  • Albumins
  • Antimetabolites
  • Azathioprine
  • Bilirubin