Nat Neurosci. 2020 Jan;23(1):113-121. doi: 10.1038/s41593-019-0544-7. Epub 2019 Dec 2.

Binocular viewing geometry shapes the neural representation of the dynamic three-dimensional environment


Kathryn Bonnen et al. Nat Neurosci. 2020 Jan.

Abstract

Sensory signals give rise to patterns of neural activity, which the brain uses to infer properties of the environment. For the visual system, considerable work has focused on the representation of frontoparallel stimulus features and binocular disparities. However, inferring the properties of the physical environment from retinal stimulation is a distinct and more challenging computational problem-this is what the brain must actually accomplish to support perception and action. Here we develop a computational model that incorporates projective geometry, mapping the three-dimensional (3D) environment onto the two retinae. We demonstrate that this mapping fundamentally shapes the tuning of cortical neurons and corresponding aspects of perception. For 3D motion, the model explains the strikingly non-canonical tuning present in existing electrophysiological data and distinctive patterns of perceptual errors evident in human behavior. Decoding the world from cortical activity is strongly affected by the geometry that links the environment to the sensory epithelium.


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1: MT neurons exhibit atypical "terraced" tuning structure for environmental velocities in 3D.
a. For the purposes of this study, 3D motion refers to velocities that lie in the xz-plane. This allows us to unwrap the motion direction onto a linear axis (as is typically done with frontoparallel motion): right, away, left, toward, right. b. Average neural response to 3D (xz) motion direction for 6 example neurons in macaque MT [4]. Each panel depicts the average response of a single example neuron to the presentation of different 3D motion directions (black dots). Predictions of the model proposed here are plotted for comparison (purple). Stimuli consisted of binocular presentations of motions consistent with a wide array of directions in the xz-plane (fully crossed manipulation of retinal velocities in the two eyes: −10°/s, −2°/s, −1°/s, 1°/s, 2°/s, 10°/s). This results in motions presented in 28 unique directions (of varying environmental speeds), with each of the four cardinal directions (right, away, left, toward) repeated at 3 different speeds. These motion stimuli were presented at 6 different grating orientations (0°, 30°, 60°, 90°, 120°, 150°), all drifting orthogonal to grating orientation. Each stimulus was repeated 25 times. In the examples here, we have plotted the data from the vertically oriented grating. For the purposes of our analyses we included all data except those collected using the horizontally oriented grating, which does not carry a proper binocular velocity signal. Additional details about these experiments can be found in the original paper [4].
Figure 2: An encoding model that incorporates the environment-to-retina geometry of 3D motion predicts atypical structures for binocular 3D motion tuning curves.
a. Diagram of the projection of 3D motion (confined to the xz-plane; middle panel) onto the left eye (blue; left panel) and the right eye (red; right panel). The color wheels in the middle panel identify 16 xz-directions, and those directions are also marked on the retinal velocity panels for the left and right eye. For simplicity, velocities are plotted in a world-motion reference frame, i.e., leftward motion in the world is also plotted as 'leftward' in the retinal velocity panels. The assumption that the ocular axes are 90° apart results in an effective viewing distance of 1/2 × the interpupillary distance. b. Left and right eye retinal velocities as a function of 3D motion direction. These are replotted from the left and right eye panels in a. c-e. Each row represents an example model neuron generated from fits to 3 neurons found in [4]. c. A 3D model neuron that exhibits slight ocular dominance and leftward preference, and is direction selective. (Left panel) Monocular retinal velocity tuning curves for the left and right eye. (Middle panel) Monocular neural responses as a function of 3D motion direction, built from the composition of the functions depicted in b and in the left panel (i). (Right panel) Binocular 3D motion direction tuning curve computed from a weighted linear combination of the monocular responses in the middle panel. Data points (circles) trace the transformation of a single 3D direction from b through all three panels in c. d. A 3D model neuron that exhibits strong ocular dominance and rightward preference, and is direction selective. e. A 3D model neuron that exhibits rightward preference and is less direction selective.
Figure 3: A 3D model decoder successfully estimates 3D motion direction.
However, the resulting pattern of estimates is distinct from that of an idealized Gaussian (von Mises) model. a. Binocular tuning curves from the computational model for decoding 3D motion direction, assuming a viewing distance of 1/2 × the interpupillary distance. These 16 example 3D direction tuning curves were chosen because their preferred directions (as calculated by the vector average) were closest to tiling 3D direction with 16 evenly spaced values in the xz-plane (0°, 22.5°, 45°, …, 337.5°). b. The decoder successfully estimates 3D motion direction; estimates (dots) fall on the unity line (dashed white line). c. The mean estimation error (purple line) and standard deviation (purple cloud) are plotted as a function of 3D direction (n = 36,000; 100 independent estimates for each of 360 tested directions). The standard deviation of the estimates (purple cloud) varies cyclically as a function of the motion direction presented. This is a consequence of the binocular projective geometry. d. For comparison to a-c: an idealized population of neurons with Gaussian tuning for 3D motion direction. Here we show 16 evenly spaced Gaussian tuning curves (with preferred directions 0°, 22.5°, 45°, …, 337.5°); 236 evenly spaced neurons were used in the simulated population. This matches the number of neurons in the recorded population and simulated in the computational model. e. The Gaussian decoder successfully estimates 3D motion direction; estimates (purple dots) fall on the unity line (dashed white line). f. The mean estimation error (purple line) and standard deviation of estimates (purple cloud) are plotted as a function of 3D direction (n = 36,000; 100 independent estimates for each of 360 tested directions). Note that the standard deviation of the estimation error does not vary as a function of the motion direction presented (compare to c).
Figure 4: Model estimates of 3D motion direction change with viewing distance, resulting in surprising model errors at far viewing distances.
a. At a larger (67 cm) viewing distance, the retinal velocities are smaller in magnitude and the difference between the left and right eye retinal velocities is drastically reduced. b. The effect of increased viewing distance on individual tuning curves is a convergence of steep transitions on the toward/away motion directions. This results in a relatively symmetrical function except close to the toward and away directions. This symmetry is present across the whole population (because it is a lawful consequence of binocular projective geometry; e.g., c), and it leads to the unusual model errors evident in d. c. Binocular tuning curves for 3D motion direction at a viewing distance of 67 cm. These 16 3D direction tuning curves are the same example units as those shown in Figure 3a. d. Model estimates of 3D motion direction for a viewing distance of 67 cm (n = 15 estimates for each of 72 tested directions). A pattern of biases and depth-sign errors emerges, forming an 'X' pattern of results.
Figure 5: Systematic biases for toward/away motion emerge with increased viewing distance.
a-d. Model performance for motion direction estimation for a single environmental speed (5 cm/s) at four different viewing distances (3.25 cm, 20 cm, 31 cm, 67 cm). Colors indicate model estimates of environmental speed. The unity line (black) marks the presented motion directions. e-h. The same model and estimates as in a-d, but plotted as a function of the corresponding left and right eye retinal velocities. Again the thick black line represents the presented motion. The dashed lines indicate the axes of toward/away motion and left/right motion. From this representation, it is evident that the variability around the retinal velocities is similarly shaped across viewing distances, but that the transformation to environmental velocity results in systematic differences in model estimation performance at different viewing distances. i-l. The mapping from retinal velocities to environmental velocities at different viewing distances. Again the thick black line represents the presented motion.
Figure 6: Human performance on a 3D motion direction estimation task matches model observer performance.
a-c. Results from a human psychophysics experiment. Three observers were shown dot motion clouds moving in one direction and asked to estimate the 3D motion direction. a. 3D motion direction estimation performance collapsed across 3 human observers at a 20 cm viewing distance. Each dot represents an estimate from a single trial (n = 15 per direction, 72 directions tested). Data points are rendered semi-transparently to make the density of estimates visible. b. 3D motion direction estimation performance collapsed across 3 human observers at a 31 cm viewing distance. c. 3D motion direction estimation performance collapsed across 3 human observers at a 67 cm viewing distance. d-f. 3D model performance estimating motion direction in the same conditions as the human observers in a-c. Notice that with increased viewing distance there is an increase in the number of depth-sign errors and a bias away from frontoparallel motion for both the model and the human observers. g. The percentage of depth-sign errors as a function of viewing distance for the two models and 3 human observers, demonstrating a categorical difference between the predictions made by the 3D model and the von Mises model. Human observers are clearly better matched by the 3D model.
Figure 7: Subtle tuning differences across the two eyes enable the toward-vs-away aspect of decoding for 3D motion direction.
Each lettered panel shows the performance of a decoder (upper), based upon a particular simulated neural population (lower), at a simulated viewing distance of ipd/2 (as in Figure 3), given a particular set of tuning characteristics: a. the original tuning measured in this paper (slightly different across the two eyes for all parameters); b. equal monocular inputs from the two eyes; c. tuning that differs across the two eyes only in response amplitude; d. only in tuning bandwidth; e. only in speed preference; or f. only in baseline firing rate.
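To make the manipulations in panels c-f concrete, here is a minimal sketch of a parametrized monocular speed-tuning curve and a weighted binocular combination (mirroring the linear combination described in Figure 2c). The log-Gaussian form, the weights, and all numeric values are assumptions chosen for illustration; direction selectivity is omitted for brevity.

    import numpy as np

    def monocular_response(speed, amp, pref_speed, bandwidth, baseline):
        """Assumed log-Gaussian speed tuning (a common MT parametrization), used here only
        to expose the four parameters varied across the eyes in panels c-f."""
        eps = 0.1  # keeps the logarithm well-defined near zero speed
        return baseline + amp * np.exp(
            -np.log((np.abs(speed) + eps) / (pref_speed + eps)) ** 2 / (2.0 * bandwidth ** 2))

    def binocular_response(v_left, v_right, left_params, right_params, w_left=0.6, w_right=0.4):
        """Weighted linear combination of the two monocular responses.  Making left_params
        differ from right_params in exactly one entry reproduces the manipulations in c-f."""
        return (w_left * monocular_response(v_left, *left_params)
                + w_right * monocular_response(v_right, *right_params))

    # Panel-d-style manipulation (hypothetical numbers): the eyes differ only in bandwidth.
    left_params = (30.0, 2.0, 1.0, 5.0)    # amp, preferred speed (deg/s), bandwidth, baseline
    right_params = (30.0, 2.0, 1.5, 5.0)
    r = binocular_response(v_left=-2.0, v_right=1.0,
                           left_params=left_params, right_params=right_params)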


References

    1. von Helmholtz, Hermann. Treatise on Physiological Optics. 1867.
    2. Gibson, James Jerome. The Perception of the Visual World. 1950.
    3. Rokers, Bas, Cormack, Lawrence K, and Huk, Alexander C. Disparity- and velocity-based signals for three-dimensional motion perception in human MT+. Nature Neuroscience, 12(8):1050–1055, August 2009.
    4. Czuba, Thaddeus B, Huk, Alexander C, Cormack, Lawrence K, and Kohn, Adam. Area MT encodes three-dimensional motion. The Journal of Neuroscience, 34(47):15522–15533, November 2014.
    5. Sanada, Takahisa M and DeAngelis, Gregory C. Neural representation of motion-in-depth in area MT. The Journal of Neuroscience, 34(47):15508–15521, November 2014.

