Applying Classification Trees to Hospital Administrative Data to Identify Patients with Lower Gastrointestinal Bleeding

PLoS One. 2015 Sep 25;10(9):e0138987. doi: 10.1371/journal.pone.0138987. eCollection 2015.


Background: Lower gastrointestinal bleeding (LGIB) is a common cause of acute hospitalization. Currently, there is no accepted standard for identifying patients with LGIB in hospital administrative data. The objective of this study was to develop and validate a set of classification algorithms that use hospital administrative data to identify LGIB.

Methods: Our sample consists of patients admitted between July 1, 2001 and June 30, 2003 (derivation cohort) and July 1, 2003 and June 30, 2005 (validation cohort) to the general medicine inpatient service of the University of Chicago Hospital, a large urban academic medical center. Confirmed cases of LGIB in both cohorts were determined by reviewing the charts of those patients who had at least 1 of 36 principal or secondary International Classification of Diseases, Ninth revision, Clinical Modification (ICD-9-CM) diagnosis codes associated with LGIB. Classification trees were used on the data of the derivation cohort to develop a set of decision rules for identifying patients with LGIB. These rules were then applied to the validation cohort to assess their performance.

Results: Three classification algorithms were identified and validated: a high specificity rule with 80.1% sensitivity and 95.8% specificity, a rule that balances sensitivity and specificity (87.8% sensitivity, 90.9% specificity), and a high sensitivity rule with 100% sensitivity and 91.0% specificity.

Conclusion: These classification algorithms can be used in future studies to evaluate resource utilization and assess outcomes associated with LGIB without the use of chart review.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Chicago
  • Female
  • Gastrointestinal Hemorrhage / diagnosis*
  • Hospitalization
  • Humans
  • International Classification of Diseases*
  • Male
  • Medical Records*
  • Middle Aged
  • Sensitivity and Specificity