Review

Linking Normative Models of Natural Tasks to Descriptive Models of Neural Response

Priyank Jaini et al. J Vis. 17(12):16.
Abstract

Understanding how nervous systems exploit task-relevant properties of sensory stimuli to perform natural tasks is fundamental to the study of perceptual systems. However, there are few formal methods for determining which stimulus properties are most useful for a given natural task. As a consequence, it is difficult to develop principled models for how to compute task-relevant latent variables from natural signals, and it is difficult to evaluate descriptive models fit to neural response. Accuracy maximization analysis (AMA) is a recently developed Bayesian method for finding the optimal task-specific filters (receptive fields). Here, we introduce AMA-Gauss, a new, faster form of AMA that incorporates the assumption that the class-conditional filter responses are Gaussian distributed. Then, we use AMA-Gauss to show that its assumptions are justified for two fundamental visual tasks: retinal speed estimation and binocular disparity estimation. Next, we show that AMA-Gauss has striking formal similarities to popular quadratic models of neural response: the energy model and the generalized quadratic model (GQM). Together, these developments deepen our understanding of why the energy model of neural response has proven useful, improve our ability to evaluate results from subunit model fits to neural data, and should help accelerate psychophysics and neuroscience research with natural stimuli.

Figures

Figure 1
Linking scientific subfields. Perception science benefits when links are drawn between psychophysical and neuroscience studies of particular tasks, task-agnostic statistical procedures that fit models to data, and task-specific normative methods that determine which models are best. The current work develops formal links between the energy model for describing neural response, the generalized quadratic model (GQM) for fitting neural response, and AMA–Gauss for determining the neural response properties that best serve a particular task.
Figure 2
Computations of an energy model neuron, a GQM model neuron, and an AMA–Gauss likelihood neuron. All three have quadratic computations at their core. The energy model and the GQM describe the computations that neurons perform. AMA–Gauss prescribes the computations that neurons should perform to optimize performance in a specific task. (A) The standard energy model assumes two Gabor-shaped orthogonal subunit filters (receptive fields) fgbr to account for a neuron's response. The response of an energy model neuron RE is obtained by adding the squared responses of the filters. (B) The GQM fits multiple arbitrarily-shaped orthogonal subunit receptive fields f that best account for a neuron's response. The response of a GQM model neuron RGQM is obtained by pooling the squared (and linear, not shown) responses of the subunit filters via a weighted sum, and passing the sum through an output nonlinearity. (C) AMA–Gauss finds the optimal subunit filters fopt and quadratic pooling rules for a specific task. Unlike standard forms of the energy model and the GQM, AMA–Gauss incorporates contrast normalization and finds subunit filters that are not necessarily orthogonal. The response of an AMA–Gauss likelihood neuron represents the likelihood of latent variable Xu. The likelihood is obtained by pooling the squared (and linear, not shown) subunit filter responses, indexed by i and j, via a weighted sum (Equation 20).
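The energy-model computation in (A) — summing the squared responses of a quadrature pair of Gabor subunit filters — can be sketched as follows. This is a minimal one-dimensional illustration with hypothetical filter parameters, not the paper's implementation:

```python
import numpy as np

def gabor(x, freq, phase, sigma):
    """1-D Gabor filter: Gaussian-windowed sinusoid."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

# Quadrature pair of Gabor subunit filters (hypothetical parameters).
x = np.linspace(-1, 1, 64)
f1 = gabor(x, freq=2.0, phase=0.0, sigma=0.3)        # even-symmetric
f2 = gabor(x, freq=2.0, phase=np.pi / 2, sigma=0.3)  # odd-symmetric

def energy_response(stimulus):
    """Energy-model response RE: sum of squared subunit filter responses."""
    return (f1 @ stimulus) ** 2 + (f2 @ stimulus) ** 2

# The hallmark of the energy computation is phase invariance: a sinusoid at
# the preferred frequency evokes a nearly constant response at every phase.
responses = [energy_response(np.cos(2 * np.pi * 2.0 * x + p))
             for p in np.linspace(0, 2 * np.pi, 8)]
```

Squaring and summing a quadrature pair discards stimulus phase while preserving stimulus energy, which is why the response stays nearly constant as the sinusoid's phase varies.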
Figure 3
Accuracy maximization analysis. (A) Factors determining the optimal decoder used by AMA during filter learning. (B) Steps for finding optimal task-specific filters via AMA.
Figure 4
Relationship between conditional filter response distributions, likelihood, and posterior probability. Two hypothetical cases are considered, each with three latent variable values. (A) One-dimensional (i.e., single filter) Gaussian conditional response distributions, when information about the latent variable is carried only by the class-conditional mean; distribution means, but not variances, change with the latent variable. The blue distribution represents the response distribution to stimuli having latent variable value Xk. The red and green distributions represent response distributions to stimuli having different latent variable values (Xu ≠ Xk). The blue dot represents the likelihood that the observed noisy filter response R1(k, l) to stimulus sk,l was elicited by a stimulus having latent variable level Xu = Xk. Red and green dots represent the likelihoods that the response was elicited by a stimulus having latent variable Xu ≠ Xk (i.e., by a stimulus having the incorrect latent variable value). (B) Posterior probability over the latent variable given the noisy observed response in (A). The posterior probability of the correct latent variable value (in this case, Xk) is given by the likelihood of the correct latent variable value normalized by the sum of all likelihoods. Colored boxes surrounding entries in the inset equation indicate the likelihood of each latent variable. (C) Two-dimensional (i.e., two-filter) Gaussian response distributions. Each ellipse represents the joint filter responses to all stimuli having the same latent variable value. The second filter improves decoding performance by selecting for useful stimulus features that the first filter does not. The black dot near the center of the blue ellipse represents an observed noisy joint response R(k, l) to stimulus sk,l.
The likelihood that the observed response was elicited by a stimulus having latent variable value Xu is obtained by evaluating the joint Gaussian at the noisy response; in this case, it equals the product of the likelihoods represented by the blue dots on the single-filter response distributions. (D–F) Same as (A–C), but where information about the latent variable is carried by the class-conditional covariance instead of the mean; ellipse orientation, but not location, changes with the latent variable. AMA–Gauss finds the filters yielding conditional response distributions that are as different from each other as possible, given stimulus constraints.
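The decoding scheme described above — evaluating class-conditional Gaussians at the observed filter response and normalizing the likelihoods by their sum — can be sketched as follows. The means and covariances are hypothetical (chosen to mimic panels D–F, where covariance carries the information), and a flat prior over latent values is assumed:

```python
import numpy as np

# Hypothetical class-conditional response statistics for three latent values:
# identical means, covariances that differ in correlation (cf. panels D-F).
means = [np.zeros(2)] * 3
covs = [np.array([[1.0, c], [c, 1.0]]) for c in (-0.8, 0.0, 0.8)]

def gauss_pdf(r, mu, cov):
    """Density of a 2-D Gaussian evaluated at joint filter response r."""
    d = r - mu
    norm = 2 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.inv(cov) @ d) / norm

def posterior(r):
    """Posterior over latent values: each likelihood divided by the sum
    of all likelihoods (flat prior assumed)."""
    L = np.array([gauss_pdf(r, m, c) for m, c in zip(means, covs)])
    return L / L.sum()

# A response along the positive diagonal favors the positively
# correlated class.
p = posterior(np.array([1.0, 1.0]))
```

The normalization step is exactly the inset equation's operation: the posterior probability of each latent value is its likelihood divided by the sum over all latent values.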
Figure 5
Speed estimation task: filters, cost, compute time, and class-conditional response distributions. (A) AMA–Gauss and AMA filters for estimating speed (−8 to +8°/s) from natural image movies are nearly identical; ρ > 0.96 for all filters. (B) The costs of the filters are identical for both models. (C) Compute time for 50 evaluations of the posterior probability distribution scales linearly with training-set size for AMA–Gauss and quadratically for the full AMA model. (D) Same data as in (C), but on log–log axes. (E, F) Joint filter responses, conditioned on each level of the latent variable, are approximately Gaussian. Different colors indicate different speeds. Individual symbols represent responses to individual stimuli. Thin black curves show that the filter response distributions, marginalized over the latent variable, are heavier-tailed than Gaussians (see Results and Discussion).
Figure 6
Disparity estimation task: filters, cost, compute time, and class-conditional response distributions. (A) AMA–Gauss and AMA filters for estimating disparity from natural stereo-images (−15 to +15 arcmin). (B–F) Caption format same as Figure 5B–F.
Figure 7
Filter responses with and without contrast normalization. (A) Class-conditional filter response distributions to contrast-normalized stimuli for each individual filter and each level of the latent variable in the speed estimation task. For visualization, responses are transformed to Z-scores by subtracting the mean and dividing by the standard deviation. A Gaussian probability density is overlaid for comparison. (B) Kurtosis of the two-dimensional conditional response distributions from filters 1 and 2 (violet; also see Figure 5E) and filters 3 and 4 (green; also see Figure 5F) for all levels of the latent variable. A two-dimensional Gaussian has a kurtosis of 8.0. Kurtosis was estimated by fitting a multidimensional generalized Gaussian via maximum likelihood methods. (C, D) Same as (A, B), but without contrast normalization. (E–H) Same as (A–D), but for the task of disparity estimation.
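The benchmark value of 8.0 for a two-dimensional Gaussian corresponds to Mardia's multivariate kurtosis — the mean squared Mahalanobis distance — which equals d(d + 2) for a d-dimensional Gaussian. The paper estimates kurtosis by fitting a generalized Gaussian; the moment-based sketch below is a simpler stand-in that recovers the same Gaussian benchmark:

```python
import numpy as np

def multivariate_kurtosis(X):
    """Mardia's multivariate kurtosis: mean of squared Mahalanobis
    distances. Equals d * (d + 2) for a d-dimensional Gaussian,
    i.e., 8.0 when d == 2; heavier tails push the value above 8."""
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = X - mu
    maha_sq = np.einsum('ij,jk,ik->i', d, cov_inv, d)
    return (maha_sq ** 2).mean()

# Sample a correlated 2-D Gaussian; the estimate should land near 8.0.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal([0.0, 0.0],
                                  [[1.0, 0.5], [0.5, 1.0]],
                                  size=200_000)
k = multivariate_kurtosis(samples)
```

Applied to the marginal (rather than class-conditional) filter responses, this statistic would exceed 8.0, consistent with the heavier-than-Gaussian tails noted in Figure 5.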
Figure 8
Decoding performance with and without contrast normalization. (A) Contrast normalization decreases decoding cost for the speed estimation task. (B) Percentage increase in cost without contrast normalization. (C, D) Same as (A, B), but for the disparity estimation task. The same result holds for different cost functions (e.g., squared error) and larger numbers of filters. If eight filters are used in the disparity task, failing to contrast normalize can decrease performance by ∼40%.
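A minimal sketch of the kind of contrast normalization at issue: convert the intensity stimulus to a Weber contrast signal, then divide by its contrast energy. The exact form and the optional semisaturation constant `c50` are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def contrast_normalize(I, c50=0.0):
    """Contrast-normalize an intensity stimulus I (a sketch):
    Weber contrast, then division by the contrast magnitude
    (plus an optional semisaturation constant c50)."""
    c = (I - I.mean()) / I.mean()          # Weber contrast signal
    return c / np.sqrt(c @ c + c50 ** 2)   # normalize contrast energy

# After normalization, stimuli that differ only in overall contrast map to
# identical signals, so filter responses no longer scale with contrast and
# the class-conditional response distributions tighten up.
I = 100.0 + np.sin(np.linspace(0, 4 * np.pi, 64))   # high-contrast stimulus
I_low = 100.0 + 0.1 * (I - 100.0)                   # same pattern, 10% contrast
high = contrast_normalize(I)
low = contrast_normalize(I_low)
```

Removing the task-irrelevant contrast dimension is what drives the cost reduction in (A) and (C): without normalization, response variance due to contrast masks the variance due to the latent variable.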
Figure 9
Quadratic pooling weights for computing the likelihood. Weights on squared and sum-squared filter responses (wii(X) and wij(X)) as a function of the latent variable. Weights on the linear filter responses are all approximately zero and are not shown. (A) Weights for speed estimation task. (B) Weights for disparity estimation task. Weights on squared responses are at maximum magnitude when the variance of the corresponding filter responses is at minimum. Weights on sum-squared responses are at maximum magnitude for latent variables yielding maximum response covariance (see Figures 5E, F and 6E, F).
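The correspondence between Gaussian likelihoods and quadratic pooling weights follows from expanding the Gaussian exponent: the precision matrix supplies the weights on squared (wii) and sum-squared (wij) responses, and the class-conditional mean supplies the linear weights (near zero here, as in the figure). A sketch with hypothetical two-filter response statistics, not Equation 20's exact parameterization:

```python
import numpy as np

# Hypothetical class-conditional response statistics for one latent value X.
mu = np.array([0.2, -0.1])
cov = np.array([[1.0, 0.6], [0.6, 2.0]])
P = np.linalg.inv(cov)  # precision matrix

def log_like_gauss(R):
    """Gaussian log-likelihood of latent value X given filter responses R."""
    d = R - mu
    return -0.5 * d @ P @ d - 0.5 * np.log(np.linalg.det(2 * np.pi * cov))

def log_like_pooled(R):
    """Same quantity via quadratic pooling: weighted sum of squared,
    sum-squared (pairwise product), and linear filter responses."""
    w_ii = -0.5 * np.diag(P)   # weights on squared responses R_i^2
    w_ij = -P[0, 1]            # weight on the pairwise product R_0 * R_1
    w_i = P @ mu               # weights on linear responses (small if mu ~ 0)
    const = -0.5 * mu @ P @ mu - 0.5 * np.log(np.linalg.det(2 * np.pi * cov))
    return w_ii @ R**2 + w_ij * R[0] * R[1] + w_i @ R + const

R = np.array([0.7, 0.3])  # an observed joint filter response
```

This expansion also explains the trends in the figure: a small response variance makes the corresponding diagonal precision entry (and hence the squared-response weight magnitude) large, while strong response covariance makes the off-diagonal entry (the sum-squared weight magnitude) large.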
Figure 10
Likelihood functions for speed and disparity tasks. (A) Likelihood functions for four randomly chosen natural image movies having true speeds of 4°/s. Each likelihood function represents the population response of the set of likelihood neurons, arranged by their preferred speeds. To ease visual comparison, each likelihood function is normalized to constant volume by dividing by the sum of the likelihoods. (B) Same as (A), but for movies with a true speed of 0°/s. (C, D) Same as (A, B), but for stereo-images with −7.5 arcmin and 0.0 arcmin of disparity, respectively. (E) Tuning curves of speed-tuned likelihood neurons. For speeds sufficiently different from zero, tuning curves are approximately log-Gaussian and increase in width with speed. For near-zero speeds, tuning curves are approximately Gaussian. Each curve represents the mean response (i.e., tuning curve) of a likelihood neuron having a different preferred speed, normalized to a common maximum. Gray areas indicate 68% confidence intervals. (F) Tuning curves of disparity-tuned likelihood neurons.
Figure 11
Natural stimuli, artificial stimuli, and class-conditional responses. Many different retinal images are consistent with a given value of the task-relevant latent variable. These differences cause within-class (task-irrelevant) stimulus variation. Within-class stimulus variation is greater for natural stimuli than for typical artificial stimuli used in laboratory experiments. (A) Stimuli for speed estimation experiments. Two different example stimuli are shown for each stimulus type: natural stimuli (represented by cartoon line drawings), Gabor stimuli, and random-dot stimuli. Both example stimuli for each stimulus type drift at exactly the same speed, but create different retinal images. Natural stimuli cause more within-class retinal stimulus variation than artificial stimuli. (B) Same as (A), but for disparity. (C) Speed task: class-conditional responses to contrast-fixed 1.0 cpd drifting Gabors with random phase. Colors indicate different speeds. Ellipses represent filter responses to natural stimuli having the same speeds. (D) Disparity task: class-conditional responses to contrast-fixed 1.5 cpd binocular Gabors with random phase. Class-conditional responses no longer have Gaussian structure, and instead have ring structure.
Figure A1
AMA vs. AMA–Gauss with simulated non-Gaussian class-conditional stimulus distributions. (A–C) Simulated stimuli from category 1 (red) and category 2 (blue). Each subplot plots two of the three stimulus dimensions against each other. All of the information for discriminating the categories is in the first two stimulus dimensions. AMA filters (black) and AMA–Gauss (gray) filters (which have the same dimensionality as the stimuli) are represented by arrows. AMA filters place all weight in the first two informative stimulus dimensions; that is, the two-dimensional subspace defined by the two AMA filters coincides with the most informative stimulus dimensions. AMA–Gauss filter weights are placed randomly across the three stimulus dimensions; that is, the two-dimensional subspace defined by the two AMA-Gauss filters will be random with respect to the most informative subspace. (D) Class-conditional AMA filter responses RAMA allow the categories to be discriminated. (E) Class-conditional AMA–Gauss filter responses RAMA–Gauss do not allow the categories to be discriminated.
