Convolutional neural network ensemble for Parkinson's disease detection from voice recordings

Comput Biol Med. 2022 Feb:141:105021. doi: 10.1016/j.compbiomed.2021.105021. Epub 2021 Nov 9.

Abstract

The computerized detection of Parkinson's disease (PD) will facilitate population screening and frequent monitoring and provide a more objective measure of symptoms, benefiting both patients and healthcare providers. Dysarthria is an early symptom of the disease and examining it for computerized diagnosis and monitoring has been proposed. Deep learning-based approaches have advantages for such applications because they do not require manual feature extraction, and while this approach has achieved excellent results in speech recognition, its utilization in the detection of pathological voices is limited. In this work, we present an ensemble of convolutional neural networks (CNNs) for the detection of PD from the voice recordings of 50 healthy people and 50 people with PD obtained from PC-GITA, a publicly available database. We propose a multiple-fine-tuning method to train the base CNN. This approach reduces the semantical gap between the source task that has been used for network pretraining and the target task by expanding the training process by including training on another dataset. Training and testing were performed for each vowel separately, and a 10-fold validation was performed to test the models. The performance was measured by using accuracy, sensitivity, specificity and area under the ROC curve (AUC). The results show that this approach was able to distinguish between the voices of people with PD and those of healthy people for all vowels. While there were small differences between the different vowels, the best performance was when/a/was considered; we achieved 99% accuracy, 86.2% sensitivity, 93.3% specificity and 89.6% AUC. This shows that the method has potential for use in clinical practice for the screening, diagnosis and monitoring of PD, with the advantage that vowel-based voice recordings can be performed online without requiring additional hardware.

Keywords: Automatic voice analysis; CNN ensemble; Convolutional neural network; Parkinson's disease; Transfer learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Humans
  • Neural Networks, Computer
  • Parkinson Disease* / diagnosis
  • Speech
  • Voice*