Can visual capture of sound separate auditory streams?

Chiara Valzolgher; Elena Giovanelli; Roberta Sorio; Giuseppe Rabini; Francesco Pavani

doi:10.1007/s00221-021-06281-8

Exp Brain Res. 2022 Mar;240(3):813-824. doi: 10.1007/s00221-021-06281-8. Epub 2022 Jan 20.

Authors

Chiara Valzolgher¹ ², Elena Giovanelli³, Roberta Sorio⁴, Giuseppe Rabini³, Francesco Pavani³ ⁵ ⁴

Affiliations

¹ Center for Mind/Brain Sciences (CIMeC), University of Trento, Rovereto, Italy. chiara.valzolgher@unitn.it.
² Integrative, Multisensory, Perception, Action and Cognition Team (IMPACT), Centre de Recherche en Neuroscience de Lyon (CRNL) Inserm U1028, CNRS UMR5292, Bâtiment Inserm 16 avenue Doyen Lépine, 69676, Bron, France. chiara.valzolgher@unitn.it.
³ Center for Mind/Brain Sciences (CIMeC), University of Trento, Rovereto, Italy.
⁴ Department of Psychology and Cognitive Sciences (DiPSCo), University of Trento, Rovereto, Italy.
⁵ Integrative, Multisensory, Perception, Action and Cognition Team (IMPACT), Centre de Recherche en Neuroscience de Lyon (CRNL) Inserm U1028, CNRS UMR5292, Bâtiment Inserm 16 avenue Doyen Lépine, 69676, Bron, France.

PMID: 35048159
DOI: 10.1007/s00221-021-06281-8

Abstract

In noisy contexts, sound discrimination improves when the auditory sources are separated in space. This phenomenon, named Spatial Release from Masking (SRM), arises from the interaction between the auditory information reaching the ear and spatial attention resources. To examine the relative contribution of these two factors, we exploited an audio-visual illusion in a hearing-in-noise task to create conditions in which the initial stimulation to the ears is held constant, while the perceived separation between speech and masker is changed illusorily (visual capture of sound). In two experiments, we asked participants to identify a string of five digits pronounced by a female voice, embedded in either energetic (Experiment 1) or informational (Experiment 2) noise, before reporting the perceived location of the heard digits. Critically, the distance between target digits and masking noise was manipulated both physically (from 22.5 to 75.0 degrees) and illusorily, by pairing target sounds with visual stimuli either at same (audio-visual congruent) or different positions (15 degrees offset, leftward or rightward: audio-visual incongruent). The proportion of correctly reported digits increased with the physical separation between the target and masker, as expected from SRM. However, despite effective visual capture of sounds, performance was not modulated by illusory changes of target sound position. Our results are compatible with a limited role of central factors in the SRM phenomenon, at least in our experimental setting. Moreover, they add to the controversial literature on the limited effects of audio-visual capture in auditory stream separation.

Keywords: Hearing in noise; Sound localization; Spatial release from masking; Visual capture of sound.

MeSH terms

Acoustic Stimulation
Female
Hearing
Humans
Noise
Perceptual Masking*
Speech
Speech Perception*
