Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups

BMC Med Inform Decis Mak. 2012 Mar 14:12:19. doi: 10.1186/1472-6947-12-19.

Abstract

Background: Hospital in-patient falls constitute a prominent problem in terms of costs and consequences. Geriatric institutions are most often affected, and common screening tools cannot predict in-patient falls consistently. Our objectives are to derive comprehensible fall risk classification models from a large data set of geriatric in-patients' assessment data and to evaluate their predictive performance (aim 1), and to identify high-risk subgroups from the data (aim 2).

Methods: A data set of n = 5,176 single in-patient episodes covering 1.5 years of admissions to a geriatric hospital was extracted from the hospital's database and matched with fall incident reports (n = 493). A classification tree model was induced using the C4.5 algorithm, a logistic regression model was fitted, and the predictive performance of both models was evaluated. Furthermore, high-risk subgroups were identified from extracted classification rules with a support of more than 100 instances.
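For readers unfamiliar with C4.5, the following minimal sketch illustrates the gain ratio, the criterion that C4.5 uses to choose the attribute to split on. The abstract does not describe the implementation used in the study, so the function names, attribute values and toy labels below are hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the discrete attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data, not taken from the study
labels = ["fall", "fall", "no_fall", "no_fall"]
informative = ["low", "low", "high", "high"]   # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]           # carries no class information

ratio_informative = gain_ratio(informative, labels)
ratio_uninformative = gain_ratio(uninformative, labels)
```

In this toy case the attribute that separates fallers from non-fallers perfectly receives the maximal gain ratio, while the attribute unrelated to the outcome receives a gain ratio of zero.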

Results: The classification tree model showed an overall classification accuracy of 66%, with a sensitivity of 55.4%, a specificity of 67.1%, and positive and negative predictive values of 15% and 93.5%, respectively. Five high-risk groups were identified, defined by high age, low Barthel index, cognitive impairment, multi-medication and co-morbidity.
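The reported predictive values follow from sensitivity, specificity and the proportion of fallers in the data. The sketch below recomputes them under the assumption that each of the 493 fall incident reports corresponds to one fall episode among the 5,176 episodes, which the abstract does not state explicitly; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV, NPV and accuracy from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence                # fallers flagged as fallers
    fn = (1 - sensitivity) * prevalence          # fallers missed
    tn = specificity * (1 - prevalence)          # non-fallers correctly cleared
    fp = (1 - specificity) * (1 - prevalence)    # non-fallers flagged as fallers
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": tp + tn,
    }


# Assumption: 493 of the 5,176 episodes involved a fall
prevalence = 493 / 5176
metrics = predictive_values(sensitivity=0.554, specificity=0.671,
                            prevalence=prevalence)
```

Under this assumption the computed values agree with the reported ones to within rounding (about 15% PPV, 93.5% NPV and 66% accuracy), which shows that the low positive predictive value is largely a consequence of the low proportion of fallers, roughly 9.5%.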

Conclusions: Our results show that a little more than half of the fallers may be identified correctly by our model, but the positive predictive value is too low for the model to be applicable in practice. Non-fallers, on the other hand, are identified quite well by the model. The high-risk subgroups and the risk factors identified (age, low ADL score, cognitive impairment, institutionalization, polypharmacy and co-morbidity) reflect domain knowledge and may be used to screen certain subgroups of patients with a high risk of falling. Classification models derived from a large data set using data mining methods can compete with current dedicated fall risk screening tools, yet lack diagnostic precision. High-risk subgroups may be identified automatically from existing geriatric assessment data, especially when combined with domain knowledge in a hybrid classification model. Further work is necessary to validate our approach in a controlled prospective setting.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Accidental Falls / statistics & numerical data*
  • Aged
  • Aged, 80 and over
  • Data Mining*
  • Decision Trees
  • Episode of Care
  • Geriatric Assessment*
  • Hospitalization / statistics & numerical data
  • Hospitalization / trends
  • Humans
  • Inpatients / classification*
  • Logistic Models
  • Patient Admission
  • Predictive Value of Tests
  • Risk Assessment / methods*
  • Vulnerable Populations