Review

Front Psychol. 2017 Mar 31;8:445. doi: 10.3389/fpsyg.2017.00445. eCollection 2017.

Assessing the Role of the 'Unity Assumption' on Multisensory Integration: A Review

Yi-Chuan Chen et al.

Abstract

There has been longstanding interest from both experimental psychologists and cognitive neuroscientists in the potential modulatory role of various top-down factors on multisensory integration/perception in humans. One such top-down influence, often referred to in the literature as the 'unity assumption,' is thought to occur in those situations in which an observer considers that various of the unisensory stimuli that they have been presented with belong to one and the same object or event (Welch and Warren, 1980). Here, we review the possible factors that may lead to the emergence of the unity assumption. We then critically evaluate the evidence concerning the consequences of the unity assumption from studies of the spatial and temporal ventriloquism effects, from the McGurk effect, and from the Colavita visual dominance paradigm. The research that has been published to date using these tasks provides support for the claim that the unity assumption influences multisensory perception under at least a subset of experimental conditions. We then consider whether the notion has been superseded in recent years by the introduction of priors in Bayesian causal inference models of human multisensory perception. We suggest that the prior of common cause (that is, the prior concerning whether multisensory signals originate from the same source or not) offers the most useful way to quantify the unity assumption as a continuous cognitive variable.
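The prior of common cause discussed above can be made concrete with a small sketch. Assuming Gaussian likelihoods for the visual and auditory signals and a Gaussian prior over source location (the standard causal inference formulation of Körding and colleagues; the function and parameter names here are illustrative, not from the review), the posterior probability that the two signals share a common cause can be computed as:

```python
import numpy as np

def posterior_common_cause(x_v, x_a, sigma_v, sigma_a, sigma_p, p_common):
    """Posterior probability that a visual signal x_v and an auditory
    signal x_a originate from one common source, given sensory noise
    (sigma_v, sigma_a), a zero-mean Gaussian prior over source location
    (sigma_p), and a prior probability of a common cause (p_common)."""
    # Likelihood of the pair under ONE common source, with the unknown
    # source location integrated out analytically.
    var_c1 = (sigma_v**2 * sigma_a**2 + sigma_v**2 * sigma_p**2
              + sigma_a**2 * sigma_p**2)
    like_c1 = np.exp(-0.5 * ((x_v - x_a)**2 * sigma_p**2
                             + x_v**2 * sigma_a**2
                             + x_a**2 * sigma_v**2) / var_c1) \
        / (2 * np.pi * np.sqrt(var_c1))
    # Likelihood of the pair under TWO independent sources.
    var_v, var_a = sigma_v**2 + sigma_p**2, sigma_a**2 + sigma_p**2
    like_c2 = np.exp(-0.5 * (x_v**2 / var_v + x_a**2 / var_a)) \
        / (2 * np.pi * np.sqrt(var_v * var_a))
    # Bayes' rule over the binary common-cause variable.
    p1 = p_common * like_c1
    return p1 / (p1 + (1 - p_common) * like_c2)
```

Spatially/temporally close signals yield a posterior near one (favoring integration), while widely discrepant signals drive it toward zero; the continuous parameter p_common plays the role of a graded unity assumption.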

Keywords: coupling priors; crossmodal correspondences; semantic congruency; the unity assumption; the unity effect.


Figures

FIGURE 1
Welch and Warren’s (1980) early model of multisensory interactions, covering those situations in which “intersensory bias would occur.” The first stage, the stimulus situation, includes the descriptive characteristics of the signals that will be received by multiple sensory systems (i.e., the so-called ‘amodal’ features) and the observer’s current goal. Notice that spatial and temporal coincidence are listed at this first stage; this constrains what happens at later stages of information processing. The second stage, modality characteristics, determines how the sensory signals are received and represented: the shape of a 3-D object, for example, is initially registered as a 2-D array by the visual system, whereas its surfaces and edges are perceived by the cutaneous and proprioceptive systems. The third stage, observer processes, concerns how human brains process/integrate the information from the different modalities in order to meet the task goal. The general historical factors refer to the long-term likelihood that information from different sensory modalities should go together; by contrast, the specific historical factors refer to the observer’s past experience with a particular stimulus pairing, such as the knowledge that one’s pet dog and its unique bark undoubtedly go together. The model also suggests that the observer’s attention is primarily allocated to the modality that is typically most appropriate to the current task, such as vision in spatial tasks and audition in temporal tasks. Nevertheless, the experimenter’s instructions or the task demands may lead to a shift of attention to another sensory modality (i.e., secondary attention). The unity assumption factor, listed at this stage, is the main interest of the current review.
These serial processes lead to one of two perceptual outcomes: either the discrepant information from the different sensory modalities is integrated, so that intersensory bias is observed, or the discrepant information is represented separately, so that the discrepancy between the two sensory stimuli is detected. Back in the 1980s, feed-forward models such as this one were the predominant view, given the then popular and rapidly developing computational approach to neural modeling. Nowadays, of course, we realize that feedback may be just as, if not even more, important (e.g., Talsma et al., 2010). This figure is reproduced from Figure 1 in Welch and Warren (1980).
FIGURE 2
A schematic figure using two dimensions to represent the relationships between the three top–down modulatory factors in studies of multisensory perception: the unity assumption, crossmodal correspondences, and semantic congruency. The X-axis highlights the fact that crossmodal correspondences typically constitute relative, rather than absolute, mappings between modality-specific dimensions (such as higher pitch and larger size, see Gallace and Spence, 2006), while semantically congruent stimuli refer to those features/attributes mapped onto a common object or category. The Y-axis represents the spatiotemporal disparity between the stimuli. The effects of crossmodal correspondences and semantic congruency often occur between stimuli separated by larger temporal disparities (on the order of hundreds of ms) that are represented as two distinct events, as in studies demonstrating crossmodal semantic priming (Chen and Spence, 2011b, 2013). The unity effect attributable to crossmodal correspondences or semantic congruency has, though, only been observed when the stimuli were presented within approximately 100 ms of each other (Vatakis and Spence, 2007; Parise and Spence, 2009).
FIGURE 3
Examples of the stimuli and the results of Parise and Spence’s (2009) study of the unity effect using a temporal order judgment (TOJ) task. (A) The crossmodal correspondences between visual size and auditory pitch. (B) The results demonstrated that it was harder for participants to correctly judge the presentation order of the visual and auditory stimuli (i.e., the just noticeable difference (JND) was significantly higher) when the stimuli were congruent than when they were incongruent.
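The JND measure used in such TOJ studies can be illustrated with a short sketch. Assuming responses follow a cumulative-Gaussian psychometric function of the stimulus onset asynchrony (SOA), the JND is conventionally the SOA offset from the 50% point needed to reach 75% of one response. The function names and numbers below are illustrative, not Parise and Spence’s data:

```python
import math

def p_audio_first(soa, pss, sigma):
    """Cumulative-Gaussian psychometric function: probability of judging
    'audio first' at a given SOA (ms), centred on the point of
    subjective simultaneity (pss) with slope parameter sigma."""
    return 0.5 * (1.0 + math.erf((soa - pss) / (sigma * math.sqrt(2.0))))

def jnd(sigma):
    """JND for a cumulative Gaussian: the SOA offset from the PSS at
    which the 75% point is reached, i.e. Phi^{-1}(0.75) * sigma."""
    return 0.6744897501960817 * sigma
```

A shallower psychometric function (larger sigma) yields a larger JND, i.e., poorer temporal sensitivity; the congruent (unity) condition in Parise and Spence’s study behaves like the larger-sigma case.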
FIGURE 4
The experimental setting and results of Kanaya and Yokosawa’s (2011) study looking at the McGurk effect in a spatial ventriloquism paradigm. (A) The experimental setting: two human faces were presented side-by-side, one intact and the other masked, with two loudspeakers placed below, each aligned with the location of one of the faces. (B) The results for the auditory stimulus pa. The incongruent condition (i.e., hearing pa but seeing ka) led to the perception of ta (i.e., the McGurk effect). In this case, spatial ventriloquism still occurred; that is, sound localization was less accurate when the sound was presented on the masked side than on the visible side. (C) The results for the auditory stimulus ka. In the incongruent condition (with the visual stimulus pa), no McGurk effect occurred; in this case, spatial ventriloquism did not occur either. That is, sound localization performance was similar whether the sound was presented on the masked or the visible side. (B,C) Reproduced from Kanaya and Yokosawa (2011) with data provided by the authors.
FIGURE 5
Results of Wallace et al.’s (2004) study of the unity effect and the spatial ventriloquism effect. (A) In the unification judgment task (i.e., judging whether the visual and auditory stimuli were presented from the same or different locations), the proportion of unification judgments decreased as either the spatial or the temporal disparity increased. (B) Hypothesis 1 suggests that the judgment of auditory location is made after the visual and auditory signals have (or have not) been integrated (i.e., unified) in the spatial domain. That is, the spatial ventriloquism effect results from a unified percept, and auditory localization should remain quite accurate whenever the visual and auditory signals are not integrated. (C) Hypothesis 2 suggests that the unification judgment is instead determined by the perceived location of each of the visual and auditory signals. That is, the visual and auditory inputs would be judged as unified if they happened to be perceived at the same location. (A) Reproduced from the data provided in Wallace et al. (2004).
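The ‘ventriloquism from a unified percept’ idea behind Hypothesis 1 is often modeled with the standard reliability-weighted fusion rule. A minimal sketch, assuming Gaussian sensory noise and illustrative numbers (this is the generic maximum-likelihood-integration rule, not Wallace et al.’s own analysis):

```python
def fused_location(x_v, x_a, sigma_v, sigma_a):
    """Reliability-weighted average of the visual (x_v) and auditory
    (x_a) location estimates; the weight on each cue is inversely
    proportional to its noise variance."""
    w_v = sigma_a**2 / (sigma_v**2 + sigma_a**2)  # weight on vision
    return w_v * x_v + (1.0 - w_v) * x_a
```

Because vision is typically the more precise spatial cue (smaller sigma_v), the fused estimate sits close to the visual location, which is why the heard location is ‘captured’ by vision when the signals are treated as unified.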
FIGURE 6
An example of the stimuli and the results of Vatakis and colleagues’ experiments on the unity effect in temporal perception (Vatakis and Spence, 2007, 2008; Vatakis et al., 2008). (A) An example of the video and audio used in the studies; they were either matched or mismatched in terms of gender. (B) The results demonstrated that it was harder for participants to correctly judge the presentation order of the video and audio (i.e., the JND was significantly higher) when the stimuli were matched than when they were mismatched. (A) Reprinted from Vatakis and Spence (2008) with permission from the authors.
FIGURE 7
The number of published articles including the term “unity effect” (or “unity assumption”) or “coupling prior” in the title, abstract, or text of papers listed in Google Scholar over the past 20 years. The use of both terms has been rising slowly but surely in recent years, thus arguing against claims that the notion of the unity effect/assumption has fallen out of fashion.
