Training a machine learning classifier to identify ADHD based on real-world clinical data from medical records

Sci Rep. 2022 Jul 28;12(1):12934. doi: 10.1038/s41598-022-17126-x.

Abstract

The diagnostic process of attention deficit hyperactivity disorder (ADHD) is complex and relies on criteria sensitive to subjective biases. This may cause significant delays in appropriate treatment initiation. An automated analysis relying on subjective and objective measures might not only simplify the diagnostic process and reduce the time to diagnosis, but also improve reproducibility. While recent machine learning studies have succeeded at distinguishing ADHD from healthy controls, the clinical process requires differentiating among other or multiple psychiatric conditions. We trained a linear support vector machine (SVM) classifier to detect participants with ADHD in a population showing a broad spectrum of psychiatric conditions using anonymized data from clinical records (N = 299 participants). We differentiated children and adolescents with ADHD from those not having the condition with an accuracy of 66.1%. SVMs trained on single features differed only slightly in accuracy, and the standard deviations of the achieved accuracies overlapped. An automated feature selection achieved the best performance using a combination of 19 features. Real-world clinical data from medical records can be used to automatically identify individuals with ADHD among help-seeking individuals using machine learning. The relevant diagnostic information can be reduced using an automated feature selection without loss of performance. A broad combination of symptoms across different domains, rather than specific domains, seems to indicate an ADHD diagnosis.
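
The abstract compares a linear SVM trained on all features with SVMs trained on single features, and reports accuracies with standard deviations. The following is a minimal sketch of that kind of comparison, not the authors' pipeline. It makes several assumptions: the data are synthetic (the clinical records are not available here), the SVM is fitted with plain stochastic subgradient descent on the regularized hinge loss, accuracy is estimated with 5-fold cross-validation, and all function names and hyperparameters (`make_synthetic`, `train_linear_svm`, `cross_validate`, `lam`, `eta`) are hypothetical.

```python
import random
import statistics

def make_synthetic(n=200, n_features=6, n_informative=2, shift=1.5, seed=0):
    """Hypothetical stand-in data: labels are +1/-1; only the first
    n_informative features carry signal, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear SVM via stochastic subgradient descent on hinge loss + L2 penalty."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            w = [wj * (1.0 - eta * lam) for wj in w]   # L2 shrinkage
            if y[i] * score < 1.0:                     # margin violated
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, row):
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1

def cross_validate(X, y, features, k=5, seed=0):
    """Return (mean, standard deviation) of accuracy over k folds,
    using only the feature columns listed in `features`."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)

    def cols(i):
        return [X[i][j] for j in features]

    scores = []
    for f in range(k):
        test = set(idx[f::k])
        train = [i for i in idx if i not in test]
        w, b = train_linear_svm([cols(i) for i in train], [y[i] for i in train])
        hits = sum(predict(w, b, cols(i)) == y[i] for i in test)
        scores.append(hits / len(test))
    return statistics.mean(scores), statistics.stdev(scores)

X, y = make_synthetic()
all_mean, all_sd = cross_validate(X, y, features=list(range(6)))
single = {j: cross_validate(X, y, features=[j]) for j in range(6)}

print(f"all features: {all_mean:.3f} +/- {all_sd:.3f}")
for j, (m, s) in single.items():
    print(f"feature {j} only: {m:.3f} +/- {s:.3f}")
```

Because the synthetic classes are well separated by construction, the accuracies printed here say nothing about the 66.1% reported for the clinical data. The sketch only illustrates how per-feature and all-feature accuracies, each with a fold-wise standard deviation, can be put side by side. An automated feature selection step would search over subsets passed as `features`. The abstract does not specify the selection procedure.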

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Attention Deficit Disorder with Hyperactivity* / diagnosis
  • Attention Deficit Disorder with Hyperactivity* / psychology
  • Child
  • Humans
  • Machine Learning
  • Medical Records
  • Reproducibility of Results
  • Support Vector Machine