Machine learning of motor vehicle accident categories from narrative data

Methods Inf Med. 1996 Dec;35(4-5):309-16.

Abstract

Bayesian inferencing as a machine learning technique was evaluated for identifying pre-crash activity and crash type from accident narratives describing 3,686 motor vehicle crashes. It was hypothesized that a Bayesian model could learn from a computer search for 63 keywords related to accident categories. Learning was defined as the ability to accurately classify previously unclassifiable narratives, that is, narratives not containing the original keywords. When narratives contained keywords, the results obtained using both the Bayesian model and the keyword search corresponded closely to expert ratings (P(detection) ≥ 0.9 and P(false positive) ≤ 0.05). For narratives not containing keywords, as the threshold used by the Bayesian model was raised from p > 0.5 to p > 0.9, the overall probability of detecting a category assigned by the expert fell from 67% to 12%, and false positives correspondingly fell from 32% to 3%. These latter results demonstrated that the Bayesian system learned from the results of the keyword searches.
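
The idea of learning from keyword-search results and then classifying keyword-free narratives against a probability threshold can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the model form, so the sketch assumes a simplified naive Bayes over word presence with add-one smoothing. The keywords, category names, and narratives below are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword -> category map (the study used 63 keywords; these are invented).
KEYWORDS = {"rear-ended": "rear_end", "backing": "backing"}

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def keyword_label(text):
    """Return the category of the first keyword found, or None if there is none."""
    for tok in tokenize(text):
        if tok in KEYWORDS:
            return KEYWORDS[tok]
    return None

def train(narratives):
    """Count word presence per category, using keyword-search hits as labels."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text in narratives:
        label = keyword_label(text)
        if label is None:
            continue  # narratives without keywords provide no training label
        doc_counts[label] += 1
        for tok in set(tokenize(text)):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab

def posterior(text, model):
    """Posterior probability of each category given the known words in the text."""
    word_counts, doc_counts, vocab = model
    total = sum(doc_counts.values())
    log_scores = {}
    for cat, n in doc_counts.items():
        score = math.log(n / total)  # category prior
        for tok in set(tokenize(text)):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[cat][tok] + 1) / (n + 2))
        log_scores[cat] = score
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def classify(text, model, threshold=0.5):
    """Assign the most probable category only if its posterior exceeds the threshold."""
    post = posterior(text, model)
    cat, p = max(post.items(), key=lambda kv: kv[1])
    return cat if p > threshold else None

# Hypothetical narratives, each containing one of the keywords above.
narratives = [
    "vehicle one rear-ended vehicle two while stopped at light",
    "driver was rear-ended while stopped in traffic",
    "truck was backing out of driveway and struck parked car",
    "insured was backing from parking space and hit pole",
]
model = train(narratives)

# A narrative with no keyword: keyword search cannot classify it.
unseen = "claimant stopped at light when hit"
low = classify(unseen, model, threshold=0.5)
high = classify(unseen, model, threshold=0.9)
```

In this toy data the keyword-free narrative shares words such as "stopped" and "light" with the rear-end examples, so it is assigned that category at the lower threshold but left unclassified at the higher one. This mirrors the trade-off reported above, where a stricter threshold lowers both detections and false positives.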

MeSH terms

  • Accidents, Traffic*
  • Bayes Theorem*
  • Epidemiologic Methods*
  • Fuzzy Logic*
  • Humans
  • Models, Theoretical
  • Reproducibility of Results
  • Wounds and Injuries / epidemiology*
  • Wounds and Injuries / etiology