Classification tree analysis: a statistical tool to investigate risk factor interactions with an example for colon cancer (United States)

Cancer Causes Control. 2002 Nov;13(9):813-23. doi: 10.1023/a:1020611416907.


Objective: Classification tree analysis is a potentially powerful tool for investigating multilevel interactions. In the context of colon cancer etiology, it may help identify disease pathways and evaluate important interactions among risk factors.

Methods: We apply classification tree analysis as a statistical method to investigate interactions of risk factors for colon cancer. We use data collected from a population-based case-control study of newly diagnosed cases of colon cancer (N = 4403 cases and controls).
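To illustrate how a classification tree selects its splits, here is a minimal sketch in Python. It assumes a CART-style Gini impurity criterion and binary risk factors; the abstract does not state which splitting rule or software was used. The eight records are synthetic and purely illustrative, and the names `gini`, `best_split`, `nsaid`, and `family_history` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 = control, 1 = case)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels, features):
    """Return (feature, weighted_impurity) for the binary feature whose
    split gives the lowest weighted Gini impurity in the child nodes."""
    n = len(labels)
    best = None
    for f in features:
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (f, score)
    return best

# Synthetic toy records (NOT study data): 1 = exposure present, 0 = absent.
rows = [
    {"nsaid": 1, "family_history": 0},
    {"nsaid": 1, "family_history": 1},
    {"nsaid": 1, "family_history": 0},
    {"nsaid": 1, "family_history": 1},
    {"nsaid": 0, "family_history": 1},
    {"nsaid": 0, "family_history": 1},
    {"nsaid": 0, "family_history": 0},
    {"nsaid": 0, "family_history": 0},
]
labels = [0, 0, 0, 0, 1, 1, 0, 1]  # 1 = case, 0 = control

# Level one: choose the best split over all records.
root = best_split(rows, labels, ["nsaid", "family_history"])

# Level two: repeat the search inside the non-user branch only.
sub_rows = [r for r in rows if r["nsaid"] == 0]
sub_labels = [y for r, y in zip(rows, labels) if r["nsaid"] == 0]
second = best_split(sub_rows, sub_labels, ["family_history"])

print(root, second)
```

Because the search is repeated separately within each branch, a factor can matter in one branch and not in another, which is how a tree expresses an interaction between risk factors.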

Results: As expected, many factors influence colon cancer risk, and they interact on many levels. The most important factor is the use of aspirin and/or non-steroidal anti-inflammatory drugs (NSAIDs), with those taking these medications having lower risk. Family history appears as a second-level modifying factor when NSAIDs are not used, whereas Western diet is the second-level factor when NSAIDs are taken. The final tree has six levels, contains several modifying factors, and correctly classifies case or control status for 60.8% (95% CI 59.4-62.2) of all individuals.
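The reported interval around the classification accuracy can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval for a proportion and assumes that all 4403 individuals were classified; the abstract does not say how the interval was computed, so this is a plausibility check rather than the authors' method.

```python
import math

n = 4403        # cases and controls, from the Methods
p_hat = 0.608   # reported proportion correctly classified
z = 1.959964    # two-sided 95% quantile of the standard normal

# Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
se = math.sqrt(p_hat * (1.0 - p_hat) / n)
lower = p_hat - z * se
upper = p_hat + z * se

print(f"95% CI: {100 * lower:.1f}-{100 * upper:.1f}")
```

Under these assumptions the calculation gives 59.4-62.2, which matches the interval reported in the abstract.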

Conclusions: Our results suggest that risk factors work together to determine disease risk. By accounting for interactions between risk factors, we are better able to dissect disease pathways and identify the risk factors that increase susceptibility to disease. Our results highlight the importance of designing studies so that interactions can be addressed.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Aged
  • Anti-Inflammatory Agents, Non-Steroidal / administration & dosage
  • Body Mass Index
  • Case-Control Studies
  • Colonic Neoplasms / epidemiology*
  • Colonic Neoplasms / etiology
  • Colonic Neoplasms / genetics
  • Decision Trees*
  • Diet
  • Female
  • Genetic Predisposition to Disease
  • Humans
  • Male
  • Middle Aged
  • Physical Fitness
  • Predictive Value of Tests
  • Risk Factors
  • Statistics as Topic / methods
  • United States / epidemiology


Substances

  • Anti-Inflammatory Agents, Non-Steroidal