Machine learning of motor vehicle accident categories from narrative data

M R Lehto; G S Sorock

Methods Inf Med. 1996 Dec;35(4-5):309-16.

Authors

M R Lehto¹, G S Sorock

Affiliation

¹ School of Industrial Engineering, Purdue University, West Lafayette, IN, USA.

PMID: 9019094

Abstract

Bayesian inferencing as a machine learning technique was evaluated for identifying pre-crash activity and crash type from accident narratives describing 3,686 motor vehicle crashes. It was hypothesized that a Bayesian model could learn from a computer search for 63 keywords related to accident categories. Learning was described in terms of the ability to accurately classify previously unclassifiable narratives not containing the original keywords. When narratives contained keywords, the results obtained using both the Bayesian model and keyword search corresponded closely to expert ratings (P(detection) ≥ 0.9 and P(false positive) ≤ 0.05). For narratives not containing keywords, when the threshold used by the Bayesian model was varied between p > 0.5 and p > 0.9, the overall probability of detecting a category assigned by the expert varied between 67% and 12%. False positives correspondingly varied between 32% and 3%. These latter results demonstrated that the Bayesian system learned from the results of the keyword searches.
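The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified naive Bayes model over the words present in each narrative, and the keyword list, category names, and narratives below are hypothetical (the abstract does not list the 63 keywords or the study data). Keyword search supplies the training labels, and the learned model then classifies a narrative containing no keyword, accepting a category only when its posterior probability exceeds a threshold.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword -> category map (the study's 63 keywords are not given).
KEYWORDS = {"rear-ended": "rear_end", "backing": "backing"}

def tokenize(text):
    """Lowercase and split on whitespace after removing commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()

def keyword_label(text):
    """Return the category of the first matching keyword, or None."""
    tokens = set(tokenize(text))
    for keyword, category in KEYWORDS.items():
        if keyword in tokens:
            return category
    return None

def train(narratives):
    """Count word presence per category, using keyword search as the labeler."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text in narratives:
        category = keyword_label(text)
        if category is None:
            continue  # narratives without keywords give no training signal
        class_counts[category] += 1
        for word in set(tokenize(text)):
            word_counts[category][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def posterior(text, model):
    """Posterior over categories from the known words present in the text.

    Assumption: words are conditionally independent given the category, and
    P(word present | category) is Laplace-smoothed. Absent words are ignored.
    """
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    words = set(tokenize(text)) & vocab
    log_scores = {}
    for category, n in class_counts.items():
        score = math.log(n / total)
        for word in words:
            score += math.log((word_counts[category][word] + 1) / (n + 2))
        log_scores[category] = score
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def classify(text, model, threshold=0.5):
    """Return the most probable category if its posterior exceeds threshold."""
    post = posterior(text, model)
    category, p = max(post.items(), key=lambda kv: kv[1])
    return category if p > threshold else None

# Hypothetical training narratives, each containing one keyword.
TRAIN = [
    "claimant was rear-ended while stopped at a red light",
    "insured vehicle rear-ended by truck while stopped in traffic",
    "insured was backing out of driveway and struck parked car",
    "driver backing from parking space struck pole",
]
MODEL = train(TRAIN)

# A narrative with no keyword: keyword search cannot classify it.
UNSEEN = "vehicle was stopped at a light when hit from behind"
print(keyword_label(UNSEEN))
print(classify(UNSEEN, MODEL, threshold=0.9))
```

Raising the threshold makes the classifier more conservative: a narrative with only weak evidence may be assigned a category at p > 0.5 but left unclassified at p > 0.9. This mirrors the trade-off reported above, where both detection and false positive rates fall as the threshold rises.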

MeSH terms

Accidents, Traffic*
Bayes Theorem*
Epidemiologic Methods*
Fuzzy Logic*
Humans
Models, Theoretical
Reproducibility of Results
Wounds and Injuries / epidemiology*
Wounds and Injuries / etiology