Audiovisual speech perception: Moving beyond McGurk

J Acoust Soc Am. 2022 Dec;152(6):3216. doi: 10.1121/10.0015262.

Abstract

Although it is clear that sighted listeners use both auditory and visual cues during speech perception, the manner in which multisensory information is combined is a matter of debate. One approach to measuring multisensory integration is to use variants of the McGurk illusion, in which discrepant auditory and visual cues produce auditory percepts that differ from those based on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we join the voices of others in the field to argue that McGurk tasks are ill-suited for studying real-life multisensory speech perception: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility to the McGurk illusion does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication: namely, words, sentences, and narratives with congruent auditory and visual speech cues.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Acoustic Stimulation
  • Auditory Perception
  • Humans
  • Illusions*
  • Language
  • Photic Stimulation
  • Speech
  • Speech Perception*
  • Visual Perception