Neuronetwork Approach in the Early Diagnosis of Depression

Psychiatr Danub. 2023 Oct;35(Suppl 2):77-85.

Abstract

Background: Depression is a common mental illness that affects around 280 million people worldwide. At present, the severity of depression is quantified mainly through psychometric scales, which entail subjectivity on the part of both patient and clinician. In the last few years, deep (machine) learning has emerged as a more objective approach for measuring depression severity. We investigate how neural networks might serve for the early diagnosis of depression.

Subjects and methods: We searched Medline (PubMed) for articles published up to June 1, 2023. The search terms were Depression AND Diagnostics AND Artificial Intelligence. We did not include depression studies that used machine learning methods other than neural networks, and we selected only papers addressing diagnosis or screening for depression.
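As an illustration only, a search of this kind could be expressed as a request to the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and sends nothing. The abstract does not state how the search was run, so the helper name `build_pubmed_query`, the lower date bound, and the use of the publication-date field are assumptions.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(terms, max_date, min_date="1900/01/01"):
    """Build an ESearch URL that ANDs the terms and limits by publication date.

    Hypothetical helper: min_date is an assumed lower bound, because ESearch
    expects mindate and maxdate to be supplied together.
    """
    params = {
        "db": "pubmed",
        "term": " AND ".join(terms),
        "datetype": "pdat",   # filter on publication date (assumption)
        "mindate": min_date,
        "maxdate": max_date,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_pubmed_query(
    ["Depression", "Diagnostics", "Artificial Intelligence"],
    max_date="2023/06/01",
)
print(url)
```

Building the URL separately from sending it keeps the query text easy to inspect and to report alongside the results.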

Results: Fifty-four papers met our criteria: 14 used facial expression recordings, 14 used EEG, 5 used fMRI, 5 used audio speech recording analysis, 6 used a multimodal approach, 2 were text analysis studies, and 8 used other methods.
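The breakdown by modality can be checked with a short tally. The counts are taken from the Results; the dictionary labels and the rounding to one decimal place are choices made for this sketch.

```python
# Number of included papers per modality, as reported in the Results.
counts = {
    "facial expression recordings": 14,
    "EEG": 14,
    "fMRI": 5,
    "audio speech recordings": 5,
    "multimodal": 6,
    "text analysis": 2,
    "other methods": 8,
}

total = sum(counts.values())

# Share of each modality as a percentage of all included papers.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(f"total papers: {total}")
for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share}%)")
```

The categories sum to the 54 included papers, with facial expression recordings and EEG together accounting for just over half of them.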

Conclusions: Research methodologies include audio and video recordings of clinical interviews and task performance, including their subsequent conversion into text, as well as resting-state studies (EEG, MRI, fMRI). Convolutional neural networks (CNNs), including 2D-CNN and 3D-CNN, can obtain diagnostic data from videos of the facial area. For EEG signals, the CNN is the most commonly used deep learning architecture. fMRI approaches use graph convolutional networks and 3D-CNN with voxel connectivity, whereas text analyses use CNNs and LSTM (long short-term memory) networks. Audio recordings are analyzed by a hybrid CNN and support vector machine model. Neural networks are also used to analyze biomaterials, gait, polysomnography, ECG, data from wrist-worn wearable devices, and records of present illness history. Multimodal studies analyze the fusion of audio features with visual and textual features using LSTM and CNN architectures, a temporal convolutional network, or a recurrent neural network. The accuracy of the different hybrid and multimodal models is 78-99% relative to standard clinical diagnoses.
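Accuracy relative to clinical diagnosis is the proportion of cases in which a model's output matches the clinician's label. A minimal sketch follows, with sensitivity and specificity added for context. The function name `diagnostic_agreement` and the ten labels are hypothetical and are not data from any reviewed study.

```python
def diagnostic_agreement(predicted, clinical):
    """Compare binary model outputs (1 = depressed) with clinical diagnoses."""
    if len(predicted) != len(clinical) or not clinical:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(predicted, clinical))
    tp = sum(1 for p, c in pairs if p == 1 and c == 1)  # true positives
    tn = sum(1 for p, c in pairs if p == 0 and c == 0)  # true negatives
    fp = sum(1 for p, c in pairs if p == 1 and c == 0)  # false positives
    fn = sum(1 for p, c in pairs if p == 0 and c == 1)  # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # None when the reference labels contain no cases of that class.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical labels for ten participants (illustration only).
model_output = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
clinical_dx = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

metrics = diagnostic_agreement(model_output, clinical_dx)
print(metrics)
```

Because accuracy alone can look high when one class dominates the sample, sensitivity and specificity are useful to report alongside it when comparing models.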

Keywords: artificial intelligence - automated speech analysis - convolutional neural networks - deep learning - depression - early diagnosis - facial recognition - smartphone.

MeSH terms

  • Artificial Intelligence*
  • Depression* / diagnosis
  • Early Diagnosis
  • Humans
  • Machine Learning
  • Neural Networks, Computer