Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene

J Neurosci. 2016 Feb 3;36(5):1596-606. doi: 10.1523/JNEUROSCI.1730-15.2016.


Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling was at both frequencies stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of listener's left superior temporal gyri in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene.

Significance statement: When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene. However, how the brain extracts the attended speech stream from the whole auditory scene and how increasing background noise corrupts this process is still debated. In this magnetoencephalography study, subjects had to attend a speech stream with or without multitalker background noise. Results argue for frequency-dependent cortical tracking mechanisms for the attended speech stream. The left superior temporal gyrus tracked the ∼0.5 Hz modulations of the attended speech stream only when the speech was embedded in multitalker background, whereas the right supratemporal auditory cortex tracked 4-8 Hz modulations during both noiseless and cocktail-party conditions.

Keywords: coherence analysis; magnetoencephalography; speech in noise.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation / methods*
  • Adult
  • Attention / physiology*
  • Auditory Cortex / physiology*
  • Female
  • Humans
  • Magnetic Resonance Imaging / methods
  • Magnetoencephalography / methods
  • Male
  • Speech Perception / physiology*
  • Temporal Lobe / physiology*
  • Young Adult