Auditory nerve representation of vowels in background noise

J Neurophysiol. 1983 Jul;50(1):27-45. doi: 10.1152/jn.1983.50.1.27.


Responses of auditory nerve fibers to steady-state vowels presented alone and in the presence of background noise were obtained from anesthetized cats. Representation of vowels based on average discharge rate and representation based primarily on phase-locked properties of responses are considered. Profiles of average discharge rate versus characteristic frequency (CF) ("rate-place" representation) can show peaks of discharge rate in the vicinity of formant frequencies when vowels are presented alone. These profiles change drastically in the presence of background noise, however. At moderate vowel and noise levels and signal/noise ratios of +9 dB, there are no peaks of rate near the second and third formant frequencies. In fact, because of two-tone suppression, rate to vowels plus noise is less than rate to noise alone for fibers with CFs above the first formant. Rate profiles measured over 5-ms intervals near stimulus onset show clear formant-related peaks at higher sound levels than do profiles measured over intervals later in the stimulus (i.e., in the steady state). However, in background noise, rate profiles at onset are similar to those in the steady state. Specifically, for fibers with CFs above the first formant, response rates to the noise are suppressed by the addition of the vowel at both vowel onset and steady state. When rate profiles are plotted for low spontaneous rate fibers, formant-related peaks appear at stimulus levels higher than those at which peaks disappear for high spontaneous rate fibers. In the presence of background noise, however, the low spontaneous rate fibers do not preserve formant peaks better than do the high spontaneous rate fibers. In fact, the suppression of noise-evoked rate mentioned above is greater for the low spontaneous rate fibers than for the high spontaneous rate fibers.
Representations that reflect phase-locked properties as well as discharge rate ("temporal-place" representations) are much less affected by background noise. We have used synchronized discharge rate averaged over fibers with CFs near (±0.25 octave) a stimulus component as a measure of the population temporal response to that component. Plots of this average localized synchronized rate (ALSR) versus frequency show clear first and second formant peaks at all vowel and noise levels used. Except at the highest level (vowel at 85 dB sound pressure level [SPL], signal/noise = +9 dB), there is also a clear third formant peak. At signal/noise ratios where there are no second formant peaks in rate profiles, human observers are able to discriminate second formant shifts of less than 112 Hz. ALSR plots show clear second formant peaks at these signal/noise ratios.
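
The ALSR averaging described in the abstract can be illustrated with a minimal sketch. It assumes the synchronized rate of each fiber to each stimulus component has already been measured; the abstract does not say how that quantity is obtained, so it is treated here as given input. The function name `alsr`, its argument names, and the array layout are hypothetical choices made for illustration, not the authors' code. Only the ±0.25 octave window around each component comes from the abstract.

```python
import numpy as np

def alsr(cfs_hz, component_hz, sync_rate, half_width_oct=0.25):
    """Average localized synchronized rate (illustrative sketch).

    cfs_hz        : characteristic frequency of each fiber, shape (n_fibers,)
    component_hz  : stimulus component frequencies, shape (n_components,)
    sync_rate     : synchronized rate of fiber i to component j,
                    shape (n_fibers, n_components); assumed precomputed
    half_width_oct: half-width of the CF window in octaves (0.25 per the abstract)

    Returns one value per component; NaN where no fiber falls in the window.
    """
    cfs_hz = np.asarray(cfs_hz, dtype=float)
    component_hz = np.asarray(component_hz, dtype=float)
    sync_rate = np.asarray(sync_rate, dtype=float)

    # Distance in octaves between every fiber CF and every component.
    dist_oct = np.abs(np.log2(cfs_hz[:, None] / component_hz[None, :]))

    # Fibers whose CF lies within the window around each component.
    in_window = dist_oct <= half_width_oct
    counts = in_window.sum(axis=0)

    # Sum the synchronized rates of the selected fibers, then average.
    sums = np.where(in_window, sync_rate, 0.0).sum(axis=0)
    out = np.full(component_hz.shape, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out

# Made-up numbers, purely to show the call; not data from the study.
cfs = [500.0, 550.0, 1000.0, 2000.0]
components = [500.0, 1000.0, 4000.0]
rates = [
    [100.0, 5.0, 0.0],
    [80.0, 10.0, 0.0],
    [10.0, 120.0, 0.0],
    [0.0, 20.0, 3.0],
]
profile = alsr(cfs, components, rates)
```

In this toy input, the 500 Hz component is averaged over the two fibers with CFs of 500 and 550 Hz, the 1000 Hz component over the single 1000 Hz fiber, and the 4000 Hz component has no fiber within the window, so its entry is NaN. Plotting such a profile against component frequency would give the kind of ALSR-versus-frequency plot the abstract refers to.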

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Cats
  • Evoked Potentials, Auditory
  • Loudness Perception / physiology
  • Nerve Fibers / physiology
  • Noise*
  • Phonetics*
  • Pitch Perception / physiology
  • Psychoacoustics
  • Speech Perception / physiology*
  • Vestibulocochlear Nerve / physiology*