Top-down attention regulates the neural expression of audiovisual integration

Neuroimage. 2015 Oct 1;119:272-85. doi: 10.1016/j.neuroimage.2015.06.052. Epub 2015 Jun 26.

Abstract

The interplay between attention and multisensory integration has proven to be a difficult question to tackle. There are almost as many studies showing that multisensory integration occurs independently of the focus of attention as studies implying that attention has a profound effect on integration. Addressing the neural expression of multisensory integration for attended vs. unattended stimuli can help disentangle this apparent contradiction. In the present study, we examine whether selective attention to sound pitch influences the expression of audiovisual integration in both behavior and neural activity. Participants were asked to attend to one of two auditory speech streams while watching a pair of talking lips that could be congruent or incongruent with the attended speech stream. We measured behavioral and neural (fMRI) responses to multisensory stimuli under attended and unattended conditions while keeping physical stimulation constant. Our results indicate that participants recognized words more accurately from an auditory stream that was both attended and audiovisually (AV) congruent, reflecting a benefit of AV integration. In contrast, no enhancement was found for AV congruency when it was unattended. Furthermore, the fMRI results indicated that activity in the superior temporal sulcus (an area known to be related to multisensory integration) was contingent on attention as well as on audiovisual congruency. This attentional modulation extended beyond heteromodal areas to affect processing in areas classically considered unisensory, such as the superior temporal gyrus and the extrastriate cortex, and in non-sensory areas such as the motor cortex. Interestingly, attention to audiovisual incongruence triggered responses in brain areas related to conflict processing (i.e., the anterior cingulate cortex and the anterior insula). Based on these results, we hypothesize that AV speech integration can take place automatically only when both modalities are sufficiently processed, and that when a mismatch is detected between the auditory and visual modalities, feedback from conflict areas minimizes its influence by reducing the processing of the less informative modality.

Keywords: Attention; Audiovisual; Multisensory; STS; Speech perception; fMRI.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation
  • Adult
  • Attention / physiology*
  • Brain / physiology*
  • Brain Mapping
  • Female
  • Humans
  • Magnetic Resonance Imaging
  • Male
  • Photic Stimulation
  • Pitch Perception / physiology*
  • Speech Perception / physiology*
  • Visual Perception / physiology*
  • Young Adult