A machine learning method to process voice samples for identification of Parkinson's disease

Anu Iyer; Aaron Kemp; Yasir Rahmatallah; Lakshmi Pillai; Aliyah Glover; Fred Prior; Linda Larson-Prior; Tuhin Virmani

doi:10.1038/s41598-023-47568-w

Sci Rep. 2023 Nov 23;13(1):20615. doi: 10.1038/s41598-023-47568-w.

Authors

Anu Iyer^#¹, Aaron Kemp^#², Yasir Rahmatallah³, Lakshmi Pillai⁴, Aliyah Glover⁴, Fred Prior³, Linda Larson-Prior^{3,4,5}, Tuhin Virmani^{3,4}

Affiliations

¹ Georgia Institute of Technology, Atlanta, 30332, USA.
² Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA. ASKemp@uams.edu.
³ Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA.
⁴ Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA.
⁵ Neurobiology and Developmental Sciences, University of Arkansas for Medical Sciences, Little Rock, 72205, USA.

^# Contributed equally.

Abstract

Machine learning approaches have been used for the automatic detection of Parkinson's disease, with voice recordings being the most commonly used data type because they are simple and non-invasive to acquire. Although recordings captured via telephone or mobile devices make data collection much easier and more widely accessible, conflicting performance results currently limit their clinical applicability. This study makes two novel contributions. First, we show the reliability of voice recordings of the sustained vowel /a/ collected over personal telephones in natural settings, by collecting samples from 50 people with specialist-diagnosed Parkinson's disease and 50 healthy controls and applying machine learning classification with voice features related to phonation. Second, we utilize a novel application of a pre-trained convolutional neural network (Inception V3) with transfer learning to analyze the spectrograms of the sustained vowel from these samples. This approach considers speech intensity estimates across time and frequency scales rather than collapsing measurements across time. We show the superiority of our deep learning model for the task of classifying people with Parkinson's disease as distinct from healthy controls.
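The contrast drawn in the abstract, between keeping intensity estimates across both time and frequency and collapsing measurements across time, can be illustrated with a minimal sketch. This is not the authors' pipeline: the signal is a synthetic stand-in for a sustained vowel, the transform is a naive pure-Python DFT chosen only to avoid dependencies, and the names `synth_vowel`, `spectrogram`, and `time_average`, along with all parameter values, are assumptions made for illustration.

```python
import cmath
import math


def synth_vowel(duration_s=0.25, sample_rate=8000, f0=125.0):
    """Hypothetical stand-in for a sustained vowel: three harmonics of f0."""
    n = int(duration_s * sample_rate)
    return [
        sum(math.sin(2 * math.pi * f0 * h * t / sample_rate) / h for h in (1, 2, 3))
        for t in range(n)
    ]


def spectrogram(signal, frame_len=128, hop=64):
    """Return power per (time frame, frequency bin) as a list of lists."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
        for i in range(frame_len)
    ]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of one windowed frame at frequency bin k.
            acc = sum(
                chunk[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames


def time_average(spec):
    """Collapse the spectrogram across time: one mean power value per bin."""
    n_frames = len(spec)
    return [sum(frame[k] for frame in spec) / n_frames for k in range(len(spec[0]))]


signal = synth_vowel()
spec = spectrogram(signal)        # 2-D: time frames x frequency bins
collapsed = time_average(spec)    # 1-D: frequency bins only
peak_bin = max(range(len(collapsed)), key=collapsed.__getitem__)
peak_hz = peak_bin * 8000 / 128   # bin index times (sample_rate / frame_len)
print(len(spec), len(spec[0]), len(collapsed), peak_hz)
```

The two-dimensional `spec` is the kind of time-frequency image an image classifier can consume, whereas `collapsed` discards how the signal evolves over the recording. In practice a library routine for the short-time Fourier transform would replace the naive DFT used here.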

MeSH terms

Humans
Machine Learning
Parkinson Disease* / diagnosis
Phonation
Reproducibility of Results
Voice*
