Automatic detection of laryngeal pathologies in records of sustained vowels by means of mel-frequency cepstral coefficient parameters and differentiation of patients by sex

Folia Phoniatr Logop. 2009;61(3):146-52. doi: 10.1159/000219950. Epub 2009 Jul 1.

Abstract

Mel-frequency cepstral coefficients (MFCC) have traditionally been used in speaker identification applications. Over the last few years, their use has been extended to speech quality assessment for clinical applications. Although the relevance of these parameters to such an application may not seem clear at first sight, previous research has demonstrated their robustness and statistical significance, as well as their close relationship with glottal noise measurements. This paper reviews this parameterization scheme and analyzes its performance for voice analysis when patients are differentiated by sex. Although differentiation by sex is commonly used to establish normative values for traditional voice descriptors (e.g. pitch, jitter, formants), it had not yet been tested for cepstral analysis of voice for clinical purposes. This paper shows that the performance of automatic detection of laryngeal pathology in voice records based on MFCC can be significantly improved by this prior differentiation by sex.
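The abstract does not give the parameter settings or the classifier used in the paper. As a rough illustration only, MFCC extraction for one frame of a sustained vowel, followed by routing to a sex-specific detector, might be sketched as below. The frame length, filter count, number of coefficients, the `detect` helper and the `models` dictionary are all assumptions made for this example, not details taken from the paper.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """One common mel-scale formula; other variants exist."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [f * n_fft / sample_rate for f in edges_hz]  # fractional bin index
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """First n_coeffs DCT-II coefficients (including the zeroth) of the
    log mel-filterbank energies. Default sizes are illustrative assumptions."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


def detect(frame, sample_rate, sex, models):
    """Hypothetical helper: pick the detector trained for the speaker's sex.
    `models` maps "female"/"male" to any callable taking an MFCC vector."""
    return models[sex](mfcc(frame, sample_rate))
```

In this sketch, one detector per sex is trained and stored separately, so that each model only has to separate normal from pathological voices within a narrower range of pitch and vocal tract characteristics. In practice a library FFT would replace the naive DFT used here for self-containment.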

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Adult
  • Electronic Data Processing / methods*
  • Female
  • Humans
  • Laryngeal Diseases / diagnosis*
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Phonetics*
  • Probability
  • Sex Characteristics*
  • Sound Spectrography
  • Speech
  • Young Adult