Predicting juvenile offending: a comparison of data mining methods

Int J Offender Ther Comp Criminol. 2013 Feb;57(2):191-207. doi: 10.1177/0306624X11431132. Epub 2011 Dec 12.

Abstract

In this study, the authors compared logistic regression with predictive data mining techniques, namely decision trees (DTs), artificial neural networks (ANNs), and support vector machines (SVMs), and examined whether these methods could discriminate between adolescents who were and were not charged for initial juvenile offending in a large Asian sample. Results were validated and tested in independent samples, with the logistic regression, DT, ANN, and SVM classifiers all achieving accuracy rates of 95% and above. Findings from receiver operating characteristic analyses also supported these results. In addition, the authors examined distinct patterns of occurrences within and across classifiers. Proactive aggression and teacher-rated conflict consistently emerged as risk factors across the validation and testing data sets for the DT and ANN classifiers and for logistic regression. Reactive aggression, narcissistic exploitativeness, being male, and coming from a nonintact family were risk factors that emerged in one or more of these data sets across classifiers, while anxiety and poor peer relationships failed to emerge as predictors.
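
The abstract reports two kinds of evidence for classifier performance: accuracy rates and receiver operating characteristic (ROC) analyses. As a minimal sketch of how the two measures relate, the Python below computes both from predicted scores on a held-out sample. Everything in it is an assumption made for illustration: the labels, the scores, the 0.5 threshold, and the names `model_a` and `model_b` are hypothetical and are not the study's data, classifiers, or procedure.

```python
# Illustrative only: labels and scores are invented, not taken from the study.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where (score >= threshold) agrees with the 0/1 label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation:
    the share of positive/negative pairs in which the positive case has the
    higher score, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = charged, 0 = not charged).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

# Hypothetical predicted probabilities from two placeholder classifiers.
scores_by_model = {
    "model_a": [0.10, 0.20, 0.30, 0.20, 0.40, 0.60, 0.70, 0.80, 0.90, 0.45],
    "model_b": [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90],
}

for name, scores in scores_by_model.items():
    print(name, accuracy(labels, scores), round(roc_auc(labels, scores), 3))
```

Accuracy depends on the chosen threshold, whereas the ROC area summarizes ranking quality across all thresholds, which is one reason a study might report both.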

Publication types

  • Comparative Study

MeSH terms

  • Adolescent
  • Aged, 80 and over
  • Aggression / psychology
  • Anxiety Disorders / diagnosis
  • Anxiety Disorders / psychology
  • Child
  • Crime / legislation & jurisprudence
  • Crime / psychology
  • Data Mining / legislation & jurisprudence
  • Data Mining / methods*
  • Decision Trees*
  • Female
  • Humans
  • Juvenile Delinquency / legislation & jurisprudence
  • Juvenile Delinquency / prevention & control
  • Juvenile Delinquency / psychology*
  • Male
  • Neural Networks, Computer*
  • Personality Disorders / diagnosis
  • Personality Disorders / psychology
  • ROC Curve
  • Risk Factors
  • Secondary Prevention
  • Singapore
  • Statistics as Topic
  • Support Vector Machine*
  • Surveys and Questionnaires