A novel data-driven prognostic model for staging of colorectal cancer

J Am Coll Surg. 2011 Nov;213(5):579-588, 588.e1-2. doi: 10.1016/j.jamcollsurg.2011.08.006. Epub 2011 Sep 16.


Background: The aim of this study was to develop a novel prognostic model that captures the complex interplay among clinical and histologic factors to predict survival of patients with colorectal cancer after a radical, potentially curative resection.

Study design: Survival data of 2,505 colon cancer and 2,430 rectal cancer patients undergoing radical colorectal resection between 1969 and 2007 were analyzed by random forest technology. The effects of TNM and non-TNM factors such as histologic grade, lymph node ratio (number positive/number resected), type of operation, neoadjuvant and adjuvant treatment, American Society of Anesthesiologists (ASA) class, and age on staging and prognosis were evaluated. A forest of 1,000 random survival trees was grown using log-rank splitting. Competing risk-adjusted random survival forest methods were used to maximize survival prediction and produce importance measures of the predictor variables.
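As a rough illustration of two quantities named in the study design, the sketch below computes the lymph node ratio as defined above (number positive/number resected) and a standard two-sample log-rank statistic of the kind a survival tree can use to score a candidate split. It is not the authors' code: the function names and the toy data are hypothetical, and the competing-risk adjustment and the forest itself are not reproduced.

```python
from typing import Sequence


def lymph_node_ratio(positive: int, resected: int) -> float:
    """Number of positive nodes divided by number of nodes resected."""
    if resected <= 0:
        raise ValueError("at least one resected node is required")
    if not 0 <= positive <= resected:
        raise ValueError("positive must lie between 0 and resected")
    return positive / resected


def log_rank_statistic(
    times: Sequence[float],
    events: Sequence[bool],
    in_left: Sequence[bool],
) -> float:
    """Two-sample log-rank chi-square statistic for one candidate split.

    times   : follow-up time per patient
    events  : True if death was observed, False if censored
    in_left : True if the patient falls in the left daughter node
    """
    obs_minus_exp = 0.0
    variance = 0.0
    # Walk over the distinct times at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_left = sum(1 for i in at_risk if in_left[i])
        dying = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dying)
        d_left = sum(1 for i in dying if in_left[i])
        # Observed minus expected events in the left node at time t.
        obs_minus_exp += d_left - n_left * d / n
        if n > 1:
            variance += n_left * (n - n_left) * d * (n - d) / (n * n * (n - 1))
    # A split that leaves one side empty has zero variance; score it as 0.
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


# Made-up toy data, for illustration only (not from the study).
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, True, True, True]
toy_split = [True, True, False, False]
print(lymph_node_ratio(3, 12))
print(log_rank_statistic(toy_times, toy_events, toy_split))
```

In a tree grown with log-rank splitting, the candidate split with the largest such statistic would typically be chosen at each node; how the study handled ties, minimum node sizes, and variable sampling is not stated in the abstract.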

Results: Competing risk-adjusted 5-year survival after resection of colon and rectal cancer was dominated by pT stage (ie, tumor infiltration depth) and lymph node ratio. Increased lymph node ratio was associated with worse survival within the same pT stage for both colon and rectal cancer patients. Whereas survival for colon cancer was affected by ASA grade, the type of resection and neoadjuvant therapy had a strong effect on rectal cancer survival. A similar pattern in predicted survival rates was observed for patients with fewer than 12 lymph nodes examined. Our model suggests that lymph node ratio remains a significant predictor of survival in this group.

Conclusions: A novel data-driven methodology predicts the survival times of patients with colorectal cancer and identifies patterns of cancer characteristics. The methods lead to stage groupings that could redefine the composition of TNM in a simple and orderly way. The higher predictive power of lymph node ratio as compared with traditional pN lymph node stage has specific implications and may address the important question of accuracy of staging in patients when fewer than 12 nodes are identified in the resection specimen.
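The contrast between pN stage and lymph node ratio can be made concrete with a small sketch. It assumes the conventional TNM node categories (pN0: no positive nodes, pN1: 1 to 3, pN2: 4 or more), which the abstract does not spell out, and uses the 12-node threshold mentioned above; the function names and example counts are hypothetical.

```python
def pn_category(positive: int) -> str:
    """Conventional pN category from the count of positive nodes (assumed cutoffs)."""
    if positive < 0:
        raise ValueError("positive node count cannot be negative")
    if positive == 0:
        return "pN0"
    return "pN1" if positive <= 3 else "pN2"


def describe_nodes(positive: int, examined: int, min_examined: int = 12) -> dict:
    """Summarize nodal status by pN category and by lymph node ratio."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    return {
        "pN": pn_category(positive),
        "lnr": positive / examined,
        # Flags specimens with fewer nodes than the 12-node threshold.
        "under_sampled": examined < min_examined,
    }


# Hypothetical patients: same pN category, different lymph node ratios.
print(describe_nodes(positive=2, examined=8))
print(describe_nodes(positive=2, examined=20))
```

Both hypothetical patients fall in the same pN category, yet their ratios differ, which is the kind of within-stage distinction the lymph node ratio can express and a count-based category cannot.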

MeSH terms

  • Aged
  • Aged, 80 and over
  • Chemotherapy, Adjuvant
  • Colectomy* / methods
  • Colonic Neoplasms / mortality
  • Colonic Neoplasms / pathology
  • Colorectal Neoplasms / mortality*
  • Colorectal Neoplasms / pathology*
  • Colorectal Neoplasms / therapy
  • Disease-Free Survival
  • Female
  • Humans
  • Kaplan-Meier Estimate
  • Lymph Nodes / pathology*
  • Lymphatic Metastasis
  • Male
  • Middle Aged
  • Models, Statistical*
  • Neoadjuvant Therapy / methods
  • Neoplasm Grading
  • Neoplasm Staging
  • Predictive Value of Tests
  • Prognosis
  • Proportional Hazards Models
  • Radiotherapy, Adjuvant
  • Rectal Neoplasms / mortality
  • Rectal Neoplasms / pathology
  • Registries