Efficient coding in human auditory perception

J Acoust Soc Am. 2009 Sep;126(3):1312-20. doi: 10.1121/1.3158939.


Natural sounds possess characteristic statistical regularities. Recent research suggests that mammalian auditory processing maximizes information about these regularities in its internal representation while minimizing encoding cost [Smith, E. C. and Lewicki, M. S. (2006). Nature (London) 439, 978-982]. Evidence for this "efficient coding hypothesis" comes largely from neurophysiology and theoretical modeling [Olshausen, B. A., and Field, D. (2004). Curr. Opin. Neurobiol. 14, 481-487; DeWeese, M., et al. (2003). J. Neurosci. 23, 7940-7949; Klein, D. J., et al. (2003). EURASIP J. Appl. Signal Process. 7, 659-667]. The present research provides behavioral evidence for efficient coding in human auditory perception using six-channel noise-vocoded speech, which drastically limits spectral information and degrades recognition accuracy. Two experiments compared recognition accuracy for vocoded speech created using theoretically motivated efficient coding filterbanks, derived from the statistical regularities of speech, against recognition accuracy for speech created using standard cochleotopic (logarithmic) or linear filterbanks. Recognition of speech created using the efficient coding filterbanks was significantly more accurate than recognition of speech created using either of the other filterbank classes. These findings suggest potential applications to cochlear implant design.
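
To make the two baseline filterbank layouts concrete, the following minimal sketch computes band edges for a six-channel filterbank with linear spacing and with logarithmic (cochleotopic-style) spacing. The 300-6000 Hz analysis range and the function names are assumptions chosen for illustration; the abstract does not state the frequency range used, and the efficient coding filterbanks themselves, which are derived from the statistics of speech, are not reproduced here.

```python
# Illustrative sketch only: band edges for six-channel linear and
# logarithmic filterbanks. The frequency range below is an assumption,
# not a value reported in the abstract.

def linear_band_edges(f_lo, f_hi, n_channels):
    """Return n_channels + 1 edges with equal bandwidth in Hz."""
    step = (f_hi - f_lo) / n_channels
    return [f_lo + i * step for i in range(n_channels + 1)]

def log_band_edges(f_lo, f_hi, n_channels):
    """Return n_channels + 1 edges with equal frequency ratio per band."""
    ratio = (f_hi / f_lo) ** (1.0 / n_channels)
    return [f_lo * ratio ** i for i in range(n_channels + 1)]

F_LO = 300.0     # assumed lower edge in Hz (hypothetical)
F_HI = 6000.0    # assumed upper edge in Hz (hypothetical)
N_CHANNELS = 6   # six channels, as in the abstract

linear_edges = linear_band_edges(F_LO, F_HI, N_CHANNELS)
log_edges = log_band_edges(F_LO, F_HI, N_CHANNELS)

for name, edges in (("linear", linear_edges), ("logarithmic", log_edges)):
    print(name, [round(e, 1) for e in edges])
```

With linear spacing every channel has the same bandwidth in Hz, whereas with logarithmic spacing every channel spans the same frequency ratio, so low-frequency channels are much narrower than high-frequency ones.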

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Acoustic Stimulation
  • Humans
  • Linear Models
  • Models, Psychological
  • Psychoacoustics*
  • Recognition, Psychology
  • Sound Spectrography
  • Speech
  • Speech Perception*
  • Task Performance and Analysis