Review
Hear Res, 229 (1-2), 116-31

The Role of Auditory Cortex in the Formation of Auditory Streams

Christophe Micheyl et al. Hear Res.

Abstract

Auditory streaming refers to the perceptual parsing of acoustic sequences into "streams", which makes it possible for a listener to follow the sounds from a given source amidst other sounds. Streaming is currently regarded as an important function of the auditory system in both humans and animals, crucial for survival in environments that typically contain multiple sound sources. This article reviews recent findings concerning the possible neural mechanisms behind this perceptual phenomenon at the level of the auditory cortex. The first part is devoted to intra-cortical recordings, which provide insight into the neural "micromechanisms" of auditory streaming in the primary auditory cortex (A1). The second part presents recent results obtained in humans using functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG), which suggest a contribution from cortical areas other than A1. Together, the findings demonstrate that many important features of sequential streaming can be explained relatively simply based on neural responses in the auditory cortex.

Figures

Figure 1
Schematic representation of the stimuli commonly used to study sequential auditory streaming and of the corresponding auditory percepts. The stimulus (left) is a temporal sequence of pure tones alternating between two frequencies, represented here and in the text by the letters A and B. The A-B frequency difference (ΔF) is either small (top left panel) or large (bottom left panel). In the former case, the percept is that of a single, coherent stream of tones alternating in pitch. In the latter case, the percept is that of two separate streams of tones; since the tones in each stream have a constant frequency, the sense of pitch alternation is lost.
Figure 2
Example psychometric functions measured in two human subjects and showing how the probability that a repeating sequence of tone triplets (ABA) is perceived as “two streams” varies as a function of time since sequence onset, for different A-B frequency separations. In order to obtain these functions, listeners were presented 20 times with the same 10-s sequence of ABA triplets and they were instructed to report their initial percept (immediately after sequence onset) and any subsequent change in percept (from one to two streams or vice versa), throughout the 10 s. Thus, at a given instant, the response was binary, i.e., one stream or two streams. The probabilities of “two streams” responses were estimated as the measured proportion of trials (out of the 20 trials per condition per listener) on which the listener's percept was that of two separate streams (at the considered instant), averaged across the two listeners. Data corresponding to different frequency separations, from 1 to 9 semitones (ST) are represented by different colors, as indicated in the legend.
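The estimation procedure described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function names, the representation of a trial as an initial percept plus a list of reported switch times, and the time grid are all hypothetical. Each reported switch is assumed to flip the percept between "one stream" and "two streams".

```python
import numpy as np

def percept_timeline(initial_two_streams, switch_times, t_grid):
    """Binary percept (1 = two streams, 0 = one stream) at each time in t_grid.

    Assumes every reported switch flips the percept (hypothetical trial format).
    """
    switch_times = np.sort(np.asarray(switch_times, dtype=float))
    # Number of reported switches that have occurred by each time point.
    n_switches = np.searchsorted(switch_times, t_grid, side="right")
    # The parity of the switch count decides the current percept.
    return (int(initial_two_streams) + n_switches) % 2

def two_stream_probability(trials, t_grid):
    """Proportion of trials heard as two streams at each time point."""
    timelines = np.array([percept_timeline(init, sw, t_grid) for init, sw in trials])
    return timelines.mean(axis=0)

def average_across_listeners(per_listener_trials, t_grid):
    """Average the per-listener proportions, as the caption describes."""
    return np.mean([two_stream_probability(t, t_grid) for t in per_listener_trials], axis=0)

# Made-up example: four trials from one listener, times in seconds.
t_grid = np.array([0.0, 2.5, 5.0, 9.0])
trials = [(False, [2.0]), (False, [4.0]), (True, []), (False, [])]
p_two = two_stream_probability(trials, t_grid)
```

In the experiment described above, each listener would contribute 20 trials per frequency separation, and one such curve would be computed per separation.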
Fig. 3
Post-stimulus time histograms (PSTHs) of neural responses in A1 to the first and last ABA tone-triplets in 20-triplet (10-second overall duration) sequences. The top row shows average PSTHs across 30 A1 units; the bottom row shows the responses from a single A1 unit.
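For readers unfamiliar with PSTHs, the following Python sketch shows one common way to compute them: spike times are binned relative to stimulus onset and the counts are averaged over trials and converted to a firing rate. The function name, the bin width, and the example spike times are assumptions made for illustration; the caption does not state the binning actually used.

```python
import numpy as np

def psth(spike_times_per_trial, t_start, t_stop, bin_width):
    """Trial-averaged firing rate (spikes/s) in consecutive time bins.

    spike_times_per_trial: list of sequences of spike times (s), one per trial,
    measured relative to stimulus onset (hypothetical data layout).
    """
    n_bins = int(round((t_stop - t_start) / bin_width))
    edges = t_start + bin_width * np.arange(n_bins + 1)
    counts = np.zeros(n_bins)
    for spikes in spike_times_per_trial:
        # Accumulate this trial's spike counts per bin.
        counts += np.histogram(spikes, bins=edges)[0]
    # Normalize by number of trials and bin width to obtain a rate.
    rate = counts / (len(spike_times_per_trial) * bin_width)
    return edges, rate

# Made-up example: two trials, 100-ms bins over a 500-ms window.
example_trials = [[0.01, 0.02, 0.11], [0.015, 0.35]]
edges, rate = psth(example_trials, 0.0, 0.5, 0.1)
```

A population PSTH such as the one in the top row could then be obtained by averaging the per-unit rates across units.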
Fig. 4
Number of spikes evoked by the A and B tones in ABA-triplet sequences as a function of time. The three panels correspond to the three tones in each triplet; from left to right: 1st (i.e., leading) A tone, B tone, and 2nd (i.e., trailing) A tone. Each data point corresponds to a triplet. The values along the X-axis indicate the onset time of the triplet of which the considered tone was part.
Fig. 5
Comparison between psychometric streaming functions measured in humans and neurometric functions computed based on neural responses measured in monkey A1. The neurometric predictions are shown as solid lines. The psychometric functions from Fig. 2 are re-plotted here using dashed lines in order to facilitate comparison with their neurometric equivalents. (A) Best-fitting predictions derived by combining spike-count information from 30 units. (B) Best-fitting predictions obtained by using spike-count information from a single A1 unit. (C) Best-fitting predictions obtained by combining spike-count information from 30 units, but using as decision variable the difference between the spike counts evoked by the A and B tones in each triplet.
Fig. 6
Average source waveforms from the auditory cortex of 14 listeners in response to ABA tone triplets with different A-B frequency separations (indicated in semitones on the left; they increase from top to bottom), and for a control condition in which the B tone was replaced by a silent gap (bottom trace). The P1m and N1m peaks evoked by the trailing A tone are labeled on the bottom trace. The three tones are represented schematically at their respective temporal positions underneath the traces. Each tone lasted 100 ms (including 10-ms ramps), the offset-to-onset inter-tone interval within each triplet was 50 ms, and the inter-triplet interval was 200 ms. The source waveforms shown here reflect activity originating from dipoles located in or near Heschl's gyrus (right and left hemispheres), as determined by spatiotemporal dipole source analysis (Scherg et al., 1990). Further methodological details can be found in Gutschalk et al. (2005). The main difference between the experiments described in that earlier study and the one reported here is that, here, the A frequency was fixed and the B frequency variable, whereas the converse was true in Gutschalk et al. (2005).
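The stimulus parameters given in this caption can be turned into a concrete timeline with a short Python sketch. It rests on two assumptions that the caption does not confirm: that the 200-ms inter-triplet interval is measured from the offset of one triplet to the onset of the next, and that a semitone is the usual equal-tempered step (one twelfth of an octave). The function names and the 1000-Hz A frequency in the example are hypothetical.

```python
def b_frequency(f_a_hz, semitones):
    """Frequency of the B tone lying `semitones` above the A tone.

    Uses the equal-tempered definition: one semitone = 1/12 octave.
    """
    return f_a_hz * 2.0 ** (semitones / 12.0)

def triplet_tone_onsets(n_triplets, tone_ms=100, inter_tone_gap_ms=50,
                        inter_triplet_gap_ms=200):
    """Onset times (ms) of every tone in a repeating ABA sequence.

    Assumes both gaps are offset-to-onset silent intervals.
    """
    soa = tone_ms + inter_tone_gap_ms  # onset-to-onset spacing within a triplet
    period = 3 * tone_ms + 2 * inter_tone_gap_ms + inter_triplet_gap_ms
    onsets = []
    for k in range(n_triplets):
        start = k * period
        onsets.append(("A", start))            # leading A tone
        onsets.append(("B", start + soa))      # B tone
        onsets.append(("A", start + 2 * soa))  # trailing A tone
    return onsets

# Example with a hypothetical 1000-Hz A tone.
f_b = b_frequency(1000.0, 12)     # one octave above A
onsets = triplet_tone_onsets(2)   # first two triplets
```

Under these assumptions each triplet occupies 400 ms and the sequence repeats every 600 ms, so the trailing A tone begins 300 ms after triplet onset.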
Fig. 7
(A) Auditory cortex activation in response to repeating ABAB sequences for different A-B frequency separations. The activation is shown overlaid on a 3D reconstruction of the superior temporal lobe obtained from T1-weighted (∼1 × 1 × 1 mm resolution) images (MPRAGE). Stronger activation, corresponding to lower statistical p values, appears in bright yellow; weaker (albeit statistically significant) activation, in red. The data shown here are from a single listener, but typical of those obtained in a larger sample (Wilson et al., 2005). (B) Time courses of activation in auditory cortex for the two extreme A-B frequency separations: 0 and 20 semitones.

Cited by 67 PubMed Central articles

See all "Cited by" articles

Publication types

Feedback