Automatic diagnosis of vocal fold paresis by employing phonovibrogram features and machine learning methods

Comput Methods Programs Biomed. 2010 Sep;99(3):275-88. doi: 10.1016/j.cmpb.2010.01.004.


The clinical diagnosis of voice disorders is based on examination of the rapidly moving vocal folds during phonation (f0: 80-300 Hz) with state-of-the-art endoscopic high-speed cameras. Commonly, the analysis is performed subjectively via time-consuming slow-motion video playback, and it exhibits low inter- and intra-rater reliability. This study presents an objective method to overcome this drawback, based on Phonovibrography, a novel image analysis technique. For a cohort of 45 normophonic and paralytic voices, the laryngeal dynamics were captured by specialized Phonovibrogram features and analyzed with different machine learning algorithms. Classification accuracies reached 93% for 2-class and 73% for 3-class discrimination. The results were validated against subjective expert ratings made using the same diagnostic criteria. The automatic Phonovibrogram analysis approach exceeded the experienced raters' classifications by 9%. The presented method shows considerable potential for providing reliable vocal fold diagnosis support in the future.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Artificial Intelligence*
  • Computer Simulation
  • Data Interpretation, Statistical
  • Diagnosis, Computer-Assisted*
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / instrumentation
  • Image Interpretation, Computer-Assisted / methods
  • Laryngoscopy / instrumentation*
  • Pattern Recognition, Automated
  • Phonation
  • Video Recording / instrumentation
  • Video Recording / methods
  • Vocal Cord Paralysis / diagnosis*
  • Vocal Cord Paralysis / pathology
  • Vocal Cords / pathology*