Speech comprehension aided by multiple modalities: behavioural and neural interactions

Neuropsychologia. 2012 Apr;50(5):762-76. doi: 10.1016/j.neuropsychologia.2012.01.010. Epub 2012 Jan 17.

Abstract

Speech comprehension is a complex human skill whose performance requires the perceiver to combine information from several sources - e.g. voice, face, gesture, linguistic context - to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and to use behavioural speech comprehension scores to identify sites of intelligibility-related activation in multifactorial speech comprehension. During fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined in an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) was associated with increased activation in left inferior frontal gyrus and left posterior STS. The current multimodal speech comprehension paradigm demonstrates recruitment of a wide comprehension network in the brain, in which posterior STS and fusiform gyrus form sites of convergence for auditory, visual and linguistic information, while left-dominant sites in temporal and frontal cortex support successful comprehension.
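To illustrate the two stimulus-degradation manipulations named in the abstract (noise-vocoding for Auditory Clarity, Gaussian blurring for Visual Clarity), the following is a minimal Python sketch. The function names (noise_vocode, blur_frame), the number of vocoder channels, filter settings and blur width are illustrative assumptions, not the parameters used in the study.

```python
# Hypothetical sketch of the two stimulus manipulations described in the
# abstract. All parameter values below are assumptions for illustration,
# not the study's actual settings.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from scipy.ndimage import gaussian_filter

def noise_vocode(speech, fs, n_channels=4, f_lo=100.0, f_hi=8000.0):
    """Noise-vocode a speech waveform: split it into frequency bands,
    keep only each band's amplitude envelope, and use the envelopes to
    modulate band-limited noise. Fewer channels = lower auditory clarity."""
    # Logarithmically spaced band edges across the speech range (assumed).
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(len(speech), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))              # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(speech)))
        out += envelope * carrier                     # envelope-modulated noise
    return out / (np.max(np.abs(out)) + 1e-12)        # normalise to avoid clipping

def blur_frame(frame, sigma=8.0):
    """Degrade visual clarity of one video frame (H x W or H x W x 3 array)
    with a spatial Gaussian blur; larger sigma = less facial detail."""
    if frame.ndim == 3:                               # blur each colour channel
        return np.stack([gaussian_filter(frame[..., c], sigma)
                         for c in range(frame.shape[-1])], axis=-1)
    return gaussian_filter(frame, sigma)
```

In this kind of design, auditory clarity would be varied by changing the number of vocoder channels and visual clarity by changing the blur width, yielding the factorial conditions crossed with sentence predictability.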

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation / methods
  • Adolescent
  • Adult
  • Analysis of Variance
  • Brain Mapping*
  • Comprehension / physiology*
  • Female
  • Functional Laterality
  • Humans
  • Image Processing, Computer-Assisted
  • Linguistics
  • Magnetic Resonance Imaging
  • Male
  • Normal Distribution
  • Oxygen / blood
  • Pattern Recognition, Visual / physiology*
  • Speech / physiology*
  • Speech Perception / physiology*
  • Temporal Lobe / blood supply
  • Temporal Lobe / physiology*
  • Young Adult

Substances

  • Oxygen