Review
Philos Trans R Soc Lond B Biol Sci, 371 (1697)

Weighted Parallel Contributions of Binocular Correlation and Match Signals to Conscious Perception of Depth

Ichiro Fujita et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

Binocular disparity is detected in the primary visual cortex by a process similar to calculation of local cross-correlation between left and right retinal images. As a consequence, correlation-based neural signals convey information about false disparities as well as the true disparity. The false responses in the initial disparity detectors are eliminated at later stages in order to encode only disparities of the features correctly matched between the two eyes. For a simple stimulus configuration, a feed-forward nonlinear process can transform the correlation signal into the match signal. For human observers, depth judgement is determined by a weighted sum of the correlation and match signals rather than depending solely on the latter. The relative weight changes with spatial and temporal parameters of the stimuli, allowing adaptive recruitment of the two computations under different visual circumstances. A full transformation from correlation-based to match-based representation occurs at the neuronal population level in cortical area V4 and manifests in single-neuron responses of inferior temporal and posterior parietal cortices. Neurons in area V5/MT represent disparity in a manner intermediate between the correlation and match signals. We propose that the correlation and match signals in these areas contribute to depth perception in a weighted, parallel manner. This article is part of the themed issue 'Vision in our three-dimensional world'.

Keywords: binocular disparity; correspondence problem; random-dot stereogram; reversed depth perception; stereopsis; three-dimensional perception.

Figures

Figure 1.
The stereo correspondence problem, and three types of random-dot stereograms (RDSs). To infer the three-dimensional structure of the environment, the visual system needs to match correctly the features in the images received by the left and right retinae. The process of solving this correspondence problem can be defined as finding the match globally consistent across the visual field (global match) while rejecting false matches (blue dots) that do not belong to the global solution. Contrast-matched or correlated RDSs (cRDSs) have one global match, whereas contrast-reversed or anti-correlated RDSs (aRDSs) have none. Half-matched RDSs (hmRDSs), in which half of the dots are contrast-reversed, provide a half-match condition. hmRDSs are as a whole uncorrelated between the two eye images, because the positive correlation of contrast-matched dots and negative correlation of contrast-reversed dots cancel each other out.
Figure 2.
Graded anti-correlation as a tool to dissociate binocular correlation-based and match-based computations. (a) Examples of contrast-matched, half-matched and contrast-reversed RDSs (cRDS, hmRDS, aRDS, respectively). The RDSs consist of a centre disc and a surrounding annulus. The annulus is always a cRDS. (b) We reverse the luminance contrast of a varying proportion of dots in one eye. This manipulation changes the binocular match from 100% (cRDS) through 50% (hmRDS) to 0% (aRDS), while binocular correlation changes from 100% (cRDS) through 0% (hmRDS) to −100% (aRDS). (c) Predicted psychophysical performance dissociates match-based computation from correlation-based computation. (Adapted from Doi et al. [30].)
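The relation between binocular match and binocular correlation described in (b) can be checked with a small simulation. The following is a minimal sketch, not the authors' stimulus code: it assumes each dot has a contrast of +1 or −1, treats "correlation" as the mean product of left-eye and right-eye dot contrasts, and ignores dot positions and disparity. The function names (`make_graded_rds`, `binocular_match`, `binocular_correlation`) are hypothetical.

```python
import random

def make_graded_rds(n_dots, match_fraction, seed=0):
    """Return (left, right) lists of dot contrasts (+1 bright, -1 dark).

    A fraction `match_fraction` of the dots keeps the same contrast in both
    eyes; the remaining dots are contrast-reversed in the right eye.
    """
    rng = random.Random(seed)
    left = [rng.choice((-1, 1)) for _ in range(n_dots)]
    n_matched = round(n_dots * match_fraction)
    reversed_idx = set(rng.sample(range(n_dots), n_dots - n_matched))
    right = [-c if i in reversed_idx else c for i, c in enumerate(left)]
    return left, right

def binocular_match(left, right):
    """Fraction of dots whose contrast agrees between the two eyes."""
    return sum(l == r for l, r in zip(left, right)) / len(left)

def binocular_correlation(left, right):
    """Mean product of dot contrasts; with +/-1 contrasts this is 2*match - 1."""
    return sum(l * r for l, r in zip(left, right)) / len(left)

if __name__ == "__main__":
    for label, fraction in (("cRDS", 1.0), ("hmRDS", 0.5), ("aRDS", 0.0)):
        left, right = make_graded_rds(1000, fraction)
        print(label, binocular_match(left, right), binocular_correlation(left, right))
```

Under these assumptions the match level runs from 1 through 0.5 to 0 while the correlation runs from 1 through 0 to −1, which is the dissociation the graded anti-correlation manipulation relies on.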
Figure 3.
Correlation and matching computations change their contribution depending on disparity magnitude and stimulus refresh rate. (a) Schematic diagram of the weighted average of correlation and match signals that transforms a bivariate signal (disparity sign and binocular match level) into a binary choice (near versus far). The process consists of four stages (encoding, subtraction, weighted average, binary decision). The relative contribution of the correlation computation to depth judgement is controlled by the parameter w. (b,c) Per cent correct data of human observers (open circles) and the functions (coloured curves) predicted from the model shown in (a) with only the parameter w fitted independently across five disparity magnitudes (b) and four refresh rates (c). Each data point is based on 60 choices, and error bars indicate s.e.m. across two blocks of trials. The two dashed curves above and below the solid curves are the hypothetical psychometric functions for pure matching computation (w = 0) and pure correlation computation (w = 1), respectively. (d) The impact of four free parameters in the model on the weighted-average psychometric functions. Only manipulation of the relative weight reproduces the psychophysical results. By contrast, the amplitude of the detectors does not shift the point of intersections with chance performance, and the upper and lower limits of the sigmoidal response function of the matching units do not shift the y-intercept (arrows). (Adapted from Doi et al. [30,37].)
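The four stages named in (a) can be illustrated with a toy calculation. This is a sketch under stated assumptions rather than the model fitted in the cited work: the caption does not give the functional forms, so the linear correlation response, the rescaled logistic match response, the logistic decision noise and the parameter values (`slope`, `midpoint`, `noise`) are all hypothetical choices made for illustration. Only the stage order and the role of the weight w come from the caption.

```python
import math

def _sigmoid(x, slope=10.0, midpoint=0.5):
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def match_response(match_level):
    """Assumed sigmoidal match unit, rescaled so 0% match -> 0 and 100% -> 1."""
    lo, hi = _sigmoid(0.0), _sigmoid(1.0)
    return (_sigmoid(match_level) - lo) / (hi - lo)

def p_correct(disparity_sign, match_level, w, noise=0.25):
    """Probability of a correct near/far choice in the toy model.

    disparity_sign: +1 for a far stimulus, -1 for a near stimulus.
    match_level:    fraction of contrast-matched dots, from 0 to 1.
    w:              relative weight of the correlation signal
                    (0 = pure matching, 1 = pure correlation).
    """
    # Stage 1, encoding: only units tuned to the stimulus disparity are driven.
    correlation = 2.0 * match_level - 1.0  # assumed linear correlation response
    corr_far = correlation if disparity_sign > 0 else 0.0
    corr_near = correlation if disparity_sign < 0 else 0.0
    match = match_response(match_level)
    match_far = match if disparity_sign > 0 else 0.0
    match_near = match if disparity_sign < 0 else 0.0

    # Stage 2, subtraction: far-preferring minus near-preferring units.
    corr_signal = corr_far - corr_near
    match_signal = match_far - match_near

    # Stage 3, weighted average of the two representations.
    decision_variable = w * corr_signal + (1.0 - w) * match_signal

    # Stage 4, binary decision with assumed logistic noise.
    p_far = 1.0 / (1.0 + math.exp(-decision_variable / noise))
    return p_far if disparity_sign > 0 else 1.0 - p_far

if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        row = [round(p_correct(+1, m, w), 3) for m in (0.0, 0.25, 0.5, 0.75, 1.0)]
        print("w =", w, row)
```

In this toy version, pure correlation (w = 1) crosses chance at 50% match and predicts reversed depth below it, whereas pure matching (w = 0) falls to chance only at 0% match; intermediate weights move the curve between the two.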
Figure 4.
Match-based depth perception depends on local correlation. (a) RDSs to manipulate local correlation while maintaining global correlation at zero. (i) An example RDS with negative dot pairing, in which a contrast-matched dot is paired with a contrast-reversed dot. (ii) An example RDS with positive dot pairing, in which a contrast-matched dot is vertically paired with one of the other contrast-matched dots. Pairs are made likewise between contrast-reversed dots. (b) Schematic diagrams of the dot pairing. Orange dots indicate contrast-matched dots; blue dots indicate contrast-reversed dots (in a pair or quadruplet of dots, the left-side and right-side dots indicate the luminance contrast for the left and right eyes, respectively). All RDSs contain the same number of contrast-matched and contrast-reversed dots, but the percentage and sign of dot pairing vary. (c) Statistics of binocular correlation within a small area of simulated RDSs across time (different patterns). (d) The per cent correct of near/far discrimination as a function of the percentage of paired dots. Error bars indicate the standard error across three subjects. (Adapted from Doi et al. [37].)
Figure 5.
Possible mechanism of the encoding units in the weighted average model. For the correlation-based representation, local (spatially localized) disparity-energy units receive inputs from the transient temporal channel and a coarse spatial receptive field (RF). The outputs of the energy units are spatio-temporally pooled without additional nonlinearity. For the match-based representation, the sustained channel and a fine RF are used instead of the transient channel and coarse RF. An additional nonlinearity transforms the local disparity-energy signals into a match-based representation of disparity. This nonlinear process precedes pooling of responses across the visual field.
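For readers unfamiliar with disparity-energy units, the sketch below shows a generic one-dimensional, position-shift energy unit. It is an illustration of the general idea only, assuming Gabor-shaped receptive fields, a quadrature pair of binocular simple cells, and circular image shifts. It does not model the transient and sustained channels, the RF sizes, or the pooling and nonlinearity stages described in the caption, and all names and parameter values are hypothetical.

```python
import math
import random

def gabor(n, phase, freq=0.1, sigma=4.0):
    """1-D Gabor receptive field centred in an n-sample image."""
    centre = n // 2
    return [math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))
            * math.cos(2 * math.pi * freq * (x - centre) + phase)
            for x in range(n)]

def shift(seq, d):
    """Circularly shift a sequence to the right by d samples."""
    d %= len(seq)
    return list(seq[-d:]) + list(seq[:-d]) if d else list(seq)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def disparity_energy(left, right, preferred_disparity):
    """Sum of squared outputs of two binocular simple cells in quadrature.

    The right-eye RF is the left-eye RF shifted by the preferred disparity.
    """
    energy = 0.0
    for phase in (0.0, math.pi / 2):
        rf_left = gabor(len(left), phase)
        rf_right = shift(rf_left, preferred_disparity)
        simple = dot(rf_left, left) + dot(rf_right, right)
        energy += simple ** 2
    return energy

# Hypothetical 1-D random-dot pattern with a disparity of 3 samples.
rng = random.Random(1)
left_image = [rng.choice((-1.0, 1.0)) for _ in range(64)]
stimulus_disparity = 3
correlated_right = shift(left_image, stimulus_disparity)
anticorrelated_right = [-v for v in correlated_right]

if __name__ == "__main__":
    print("cRDS:", disparity_energy(left_image, correlated_right, stimulus_disparity))
    print("aRDS:", disparity_energy(left_image, anticorrelated_right, stimulus_disparity))
```

At its preferred disparity this unit responds strongly to the correlated pattern and is suppressed by the anti-correlated one, because reversing contrast in one eye flips the sign of the binocular cross term. That sign inversion is what makes the output of such units correlation-based rather than match-based.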
Figure 6.
Responses of V4 neurons to graded anti-correlation of RDSs. (a,b) Disparity tuning curves of two V4 neurons obtained with a set of gradually anti-correlated RDSs. Error bars indicate ±s.e.m., and dashed lines indicate ongoing discharge rate. L, R and U denote response levels to monocular left, right and uncorrelated RDSs, respectively. Negative sign on x-axis denotes crossed disparity. (c) Population readout of V4 neurons. Disparity tuning curves of 92 neurons were pooled for different correlation levels. The disparity tuning curves of far-preferring neurons were flipped around 0° disparity prior to pooling. Negative sign of binocular disparity indicates preferred disparity, not crossed disparity. Lower panels show heat-map plots of the single-neuron (a,b) or population (c) responses in a plane defined by binocular disparity and correlation level. (Adapted from Abdolrahmani et al. [86].)
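The pooling step in (c), in which tuning curves of far-preferring neurons are flipped around 0° disparity before averaging, can be sketched as follows. This is a minimal illustration, assuming each tuning curve is sampled at disparities symmetric about zero and listed in ascending order; the function name and the two example curves are hypothetical and are not data from the study.

```python
def pool_tuning_curves(curves, prefers_far):
    """Average tuning curves after mirroring far-preferring ones about 0 disparity.

    curves:      list of response lists, each sampled at the same disparities,
                 symmetric about zero and in ascending order.
    prefers_far: list of booleans, True where the neuron prefers far disparities.
    After mirroring, the negative side of the axis means "preferred disparity".
    """
    aligned = [list(reversed(c)) if far else list(c)
               for c, far in zip(curves, prefers_far)]
    n = len(aligned)
    return [sum(column) / n for column in zip(*aligned)]

if __name__ == "__main__":
    # Hypothetical responses at disparities (-d, 0, +d).
    near_cell = [10.0, 5.0, 2.0]
    far_cell = [1.0, 4.0, 9.0]
    print(pool_tuning_curves([near_cell, far_cell], [False, True]))
```

Without the mirroring, near-preferring and far-preferring curves would partly cancel in the average; aligning them first preserves the tuning shape in the population readout.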
