The use of classification and regression trees in clinical epidemiology

J Clin Epidemiol. 2001 Jun;54(6):603-9. doi: 10.1016/s0895-4356(00)00344-9.


A critique is presented of the use of tree-based partitioning algorithms to formulate classification rules and identify subgroups from clinical and epidemiological data. It is argued that the methods have a number of limitations, despite their popularity and apparent closeness to clinical reasoning processes. The issue of redundancy in tree-derived decision rules is discussed. Simple rules may be unlikely to be "discovered" by tree growing. Subgroups identified by trees are often hard to interpret or to believe, and net effects are not assessed. These problems arise fundamentally because trees are hierarchical. Newer refinements of tree technology seem unlikely to be useful, wedded as they are to hierarchical structures.
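
The point about redundancy can be illustrated with a small sketch. The example below is not taken from the article; it assumes a hypothetical rule ("high risk if at least two of three binary risk factors are present") and a deliberately naive tree grower that splits on the factors in a fixed order until every node is pure. The names `rule`, `grow`, `count_leaves`, and `count_tests` are illustrative only.

```python
from itertools import product

def rule(x):
    # Hypothetical simple rule (an assumption, not from the article):
    # high risk if at least 2 of the 3 binary risk factors are present.
    return int(sum(x) >= 2)

def grow(rows, features):
    """Grow an exact hierarchical tree, splitting until each node is pure."""
    labels = {rule(r) for r in rows}
    if len(labels) == 1:
        return labels.pop()  # leaf: the single class label in this node
    f, rest = features[0], features[1:]
    absent = [r for r in rows if r[f] == 0]
    present = [r for r in rows if r[f] == 1]
    # Internal node: (feature index, subtree if absent, subtree if present)
    return (f, grow(absent, rest), grow(present, rest))

def count_leaves(node):
    if isinstance(node, int):
        return 1
    return count_leaves(node[1]) + count_leaves(node[2])

def count_tests(node):
    # Number of internal nodes, i.e. separate tests of a risk factor.
    if isinstance(node, int):
        return 0
    return 1 + count_tests(node[1]) + count_tests(node[2])

# All 8 combinations of three binary risk factors.
rows = list(product([0, 1], repeat=3))
tree = grow(rows, [0, 1, 2])

print(tree)
print("leaves:", count_leaves(tree), "tests:", count_tests(tree))
```

Under these assumptions the one-line counting rule becomes a tree with six leaves and five internal tests, with the second and third factors each tested again in separate branches. Because the rule is symmetric in the three factors, any split order gives a tree of the same size, which is one way to see how a hierarchical structure can obscure a simple rule.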

MeSH terms

  • Algorithms*
  • Epidemiologic Research Design*
  • Humans
  • Regression Analysis