Comparative Study
, 9 (13), 17.1-16

Optimal Stimulus Encoders for Natural Tasks

Wilson S Geisler et al. J Vis.

Erratum in

  • J Vis. 2010;10(2):27.1-2

Abstract

Determining the features of natural stimuli that are most useful for specific natural tasks is critical for understanding perceptual systems. A new approach is described that involves finding the optimal encoder for the natural task of interest, given a relatively small population of noisy "neurons" between the encoder and decoder. The optimal encoder, which necessarily specifies the most useful features, is found by maximizing accuracy in the natural task, where the decoder is the Bayesian ideal observer operating on the population responses. The approach is illustrated for a patch identification task, where the goal is to identify patches of natural image, and for a foreground identification task, where the goal is to identify which side of a natural surface boundary belongs to the foreground object. The optimal features (receptive fields) are intuitive and perform well in the two tasks. The approach also provides insight into general principles of neural encoding and decoding.

Figures

Figure 1
Framework for characterizing natural scene statistics for specific identification tasks. The category of object in the environment is represented by a vector ω(k), indexed by category number k. The proximal stimulus is represented by a vector s(k, l), indexed by category number k and exemplar number l. Thus, a randomly sampled natural stimulus in the natural task can be regarded as a random sample (K, L) from an unknown joint probability distribution, p0(k, l). The proximal stimulus is encoded by a population of q neurons, where the mean response of the population to a particular stimulus s(k, l) is rq(k, l) = [r1(k, l),…, rq(k, l)], and the variability of each neuron’s response (to repeated presentations of the same stimulus) is represented by an additive sample of noise Nq = [N1,…, Nq], whose variance may depend upon the mean response and which may be correlated across neurons. The population response is optimally decoded into an estimate ω̂ of the object category in the environment (the distal stimulus). The goal of accuracy maximization analysis is to determine the encoding functions [r1(k, l),…, rq(k, l)] that maximize accuracy in the identification task. (Bold letters represent vector quantities; capital letters represent random variables.)
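The encode, add-noise, decode loop in this caption can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a linear encoder, independent Gaussian noise with a fixed standard deviation `sigma` (the caption allows response-dependent and correlated noise), and a uniform prior over the training stimuli. The function names `posterior_over_categories` and `simulate_accuracy` are hypothetical.

```python
import numpy as np

def posterior_over_categories(R, mean_responses, categories, n_categories, sigma=1.0):
    """Posterior p(k | R) for a noisy population response R.

    mean_responses: (n_stimuli, q) noise-free responses r(k, l)
    categories:     (n_stimuli,) integer category index k of each stimulus
    Assumption: isotropic Gaussian noise and a uniform prior over stimuli.
    """
    sq_dist = np.sum((mean_responses - R) ** 2, axis=1)
    log_like = -0.5 * sq_dist / sigma**2
    log_like -= log_like.max()            # stabilize the exponentials
    like = np.exp(log_like)
    # Sum the stimulus likelihoods within each category, then normalize.
    post = np.bincount(categories, weights=like, minlength=n_categories)
    return post / post.sum()

def simulate_accuracy(stimuli, categories, weights, sigma=1.0, n_trials=2000, seed=0):
    """Monte Carlo estimate of decoder accuracy for linear receptive fields.

    stimuli: (n_stimuli, d) stimulus vectors s(k, l)
    weights: (q, d) receptive-field weights (assumed linear encoder)
    """
    rng = np.random.default_rng(seed)
    mean_responses = stimuli @ weights.T  # (n_stimuli, q)
    n_categories = int(categories.max()) + 1
    q = mean_responses.shape[1]
    correct = 0
    for _ in range(n_trials):
        i = rng.integers(len(stimuli))    # sample a stimulus (K, L)
        R = mean_responses[i] + sigma * rng.standard_normal(q)
        post = posterior_over_categories(R, mean_responses, categories,
                                         n_categories, sigma)
        correct += int(np.argmax(post) == categories[i])
    return correct / n_trials
```

In this simplified setting, finding optimal receptive fields would amount to searching over `weights` for the values that maximize the accuracy (or a smoother surrogate of it) returned by such a decoder.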
Figure 2
Optimal linear receptive fields for a natural image patch identification task. a. Example set of 200 training patches randomly sampled from calibrated natural images. b. The final weights for the first six optimal linear RFs obtained by gradient descent from random weights. (For display purposes the receptive fields have been scaled so that the maximum absolute value is 1.0, and then interpolated by the plotting software.)
Figure 3
Evaluation of the approximation of the relative entropy of the average posterior probability distribution across categories computed by the ideal observer (the optimal decoder). a. The average relative entropy (across patches), determined by Monte Carlo simulation, as a function of the number of receptive fields. b. The correlation between actual (simulated) and estimated relative entropy (using Equation 4) as a function of the number of receptive fields in the population. The correlations were computed over image patches in the training set. c. Average actual relative entropy as a function of average estimated relative entropy (using Equation 4) for different numbers of receptive fields in the population. The points show averages for the image patches in 8 quantiles.
Figure 4
Actual accuracy of the optimal decoder in the patch identification task, as determined by Monte Carlo simulation (i.e., applying Equation 1, trial by trial). a. Actual accuracy as a function of the number of optimal receptive fields. b. Actual accuracy vs. estimated relative entropy, for 200 test patches not in the training set.
Figure 5
One of 96 hand-segmented close-up images of foliage used to obtain random samples of surface boundary; segmented leaves (blue, brown); segmented branches (yellow). The brown leaf illustrates a single segmented object.
Figure 6
Foreground identification task. a. Example 12 × 12 pixel training patches. Each patch is centered on a randomly selected point along a surface boundary contour. In estimating the receptive fields, all training patches were rotated to a canonical vertical orientation; when k = 1 the foreground was on the left, and when k = 2 the foreground was on the right. These patches have the foreground on the right. b. Final weights for the first six optimal receptive fields. (For display purposes the receptive fields have been scaled so that the maximum absolute value is 1.0, and then interpolated by the plotting software.)
Figure 7
Actual accuracy of the optimal decoder in the foreground identification task, as determined by Monte Carlo simulation. a. Actual accuracy as a function of the number of optimal AMA receptive fields (solid symbols) and as a function of the number of PCA receptive fields (open symbols). b. Actual accuracy vs. estimated relative entropy, for 200 test patches not in the training set.
Figure 8
Principal components. a. First six principal components for the training data in the patch identification task (cf. Figure 2b). b. First six principal components for the training data in the foreground identification task (cf. Figure 6b). To maximize comparability with the AMA receptive fields, all image patches were normalized to a mean of 0.0 and a standard deviation of 1.0 before computing the principal components (see Equation 6).
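As a rough illustration of the PCA comparison described in this caption, the sketch below normalizes each patch to zero mean and unit standard deviation and then extracts the leading principal components with an SVD. Equation 6 is not reproduced on this page, so the per-patch normalization is an assumption about its form, as is the conventional centering across patches before the SVD. The helper names are hypothetical, and `scale_for_display` mirrors the display scaling mentioned in the captions of Figures 2 and 6.

```python
import numpy as np

def normalize_patches(patches, eps=1e-12):
    """Flatten each patch and give it mean 0.0 and standard deviation 1.0.

    Assumption: normalization is applied per patch; eps guards against
    division by zero for constant patches.
    """
    flat = patches.reshape(len(patches), -1).astype(float)
    flat = flat - flat.mean(axis=1, keepdims=True)
    return flat / (flat.std(axis=1, keepdims=True) + eps)

def principal_components(patches, n_components=6):
    """Return the first n_components principal axes as rows."""
    X = normalize_patches(patches)
    X = X - X.mean(axis=0, keepdims=True)   # standard PCA centering
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_components]

def scale_for_display(rf):
    """Scale a receptive field so its maximum absolute value is 1.0."""
    return rf / np.max(np.abs(rf))
```

For 12 × 12 pixel patches, each returned row has 144 entries and can be reshaped to 12 × 12 for plotting next to the AMA receptive fields.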
