Fractal dimensions of speech sounds: computation and application to automatic speech recognition

J Acoust Soc Am. 1999 Mar;105(3):1925-32. doi: 10.1121/1.426738.

Abstract

The dynamics of airflow during speech production often result in some degree of turbulence. In this paper, the geometry of speech turbulence, as reflected in the fragmentation of the time signal, is quantified using fractal models. An efficient algorithm for estimating the short-time fractal dimension of speech signals, based on multiscale morphological filtering, is described, and its potential for speech segmentation and phonetic classification is discussed. Experimental results are also reported on using the short-time fractal dimension of speech signals at multiple scales as additional features in an automatic speech-recognition system based on hidden Markov models; these features provide a modest improvement in speech-recognition performance.
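
The abstract does not give the algorithm itself, so the following is only a minimal sketch of one common way to estimate a fractal dimension by multiscale morphological filtering (a covering, or Minkowski-Bouligand, estimate), and it should not be read as the authors' implementation. The function names, the flat 3-sample structuring element, the 30 ms frame at 16 kHz, and the default `max_scale` are all assumptions chosen for illustration. The idea is to dilate and erode the frame at increasing scales s, measure the area A(s) between the two envelopes, and fit the slope of log(A(s)/s^2) against log(1/s).

```python
import math
import random


def _flat_filter(x, pick):
    """Apply a flat 3-sample structuring element once.

    pick=max gives a dilation, pick=min gives an erosion.
    Windows are truncated at the frame edges.
    """
    n = len(x)
    return [pick(x[max(i - 1, 0):min(i + 2, n)]) for i in range(n)]


def fractal_dimension(frame, max_scale=10):
    """Estimate the covering dimension of one signal frame (hypothetical helper).

    Assumes max_scale >= 2 and a non-constant frame.
    """
    upper = list(frame)
    lower = list(frame)
    xs, ys = [], []
    for s in range(1, max_scale + 1):
        # s repeated 3-sample filters equal one filter of width 2*s + 1.
        upper = _flat_filter(upper, max)
        lower = _flat_filter(lower, min)
        area = sum(u - l for u, l in zip(upper, lower))
        if area <= 0:
            raise ValueError("constant frame: dimension undefined in this sketch")
        # A(s) ~ s**(2 - D), so log(A / s**2) = D * log(1 / s) + const.
        xs.append(math.log(1.0 / s))
        ys.append(math.log(area / s ** 2))
    # Ordinary least-squares slope of ys against xs is the estimate of D.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Illustrative frames: 480 samples (30 ms at an assumed 16 kHz rate).
tone = [math.sin(2 * math.pi * 100 * i / 16000) for i in range(480)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(480)]
print("tone :", fractal_dimension(tone))
print("noise:", fractal_dimension(noise))
```

Under these assumptions a smooth tone gives an estimate close to 1 and a noise-like frame gives a clearly higher one, which is the kind of contrast the abstract suggests could help separate voiced from turbulent speech sounds. Applying the function to successive overlapping frames would give a short-time fractal-dimension trajectory.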

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases as Topic
  • Electronic Data Processing
  • Fractals*
  • Humans
  • Markov Chains
  • Models, Biological
  • Phonetics
  • Sound*
  • Speech / physiology*
  • Speech Perception / physiology*
  • Speech Production Measurement
  • Time Factors