Proc Natl Acad Sci U S A. 2010 Oct 19;107(42):18149-54.
doi: 10.1073/pnas.0914916107. Epub 2010 Oct 5.

Local statistics in natural scenes predict the saliency of synthetic textures

Gasper Tkacik et al. Proc Natl Acad Sci U S A. 2010.

Abstract

The visual system is challenged with extracting and representing behaviorally relevant information contained in natural inputs of great complexity and detail. This task begins in the sensory periphery: retinal receptive fields and circuits are matched to the first and second-order statistical structure of natural inputs. This matching enables the retina to remove stimulus components that are predictable (and therefore uninformative), and primarily transmit what is unpredictable (and therefore informative). Here we show that this design principle applies to more complex aspects of natural scenes, and to central visual processing. We do this by classifying high-order statistics of natural scenes according to whether they are uninformative vs. informative. We find that the uninformative ones are perceptually nonsalient, while the informative ones are highly salient, and correspond to previously identified perceptual mechanisms whose neural basis is likely central. Our results suggest that the principle of efficient coding not only accounts for filtering operations in the sensory periphery, but also shapes subsequent stages of sensory processing that are sensitive to high-order image statistics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Local distributions of light in natural scenes. (A) Natural images are discretized into 16 equipopulated grayscale levels. Central pixels with intensity σ0 are chosen randomly, and the distribution of intensities σ1 at a radius R from the center is sampled. The distance R is represented by the scale bar in pixels. (B), (C), and (D) Histograms of pixel intensities at different distances from the center (shown here for [formula image]) if the central pixel σ0 is black, gray, or white (black, gray, and white lines). Note the difference in histograms for black vs. white central pixel σ0 for small R (see SI Appendix for P(σ1) as a function of R).
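The discretization and sampling steps in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `equipopulate` and `conditional_histogram` are hypothetical, the input is a synthetic array rather than a natural image, and neighbors are taken only at a horizontal offset of `radius` pixels instead of at all positions at distance R.

```python
import numpy as np

def equipopulate(image, levels=16):
    """Map intensities to `levels` gray levels holding (nearly) equal pixel counts."""
    flat = image.ravel()
    # Rank every pixel (ties broken by position), then cut the ranks into equal bins.
    ranks = np.argsort(np.argsort(flat, kind="stable"), kind="stable")
    return (ranks * levels // flat.size).reshape(image.shape)

def conditional_histogram(q, sigma0, radius, levels=16):
    """Estimate P(sigma1 | sigma0) for pixel pairs `radius` >= 1 columns apart.

    Assumption: horizontal offsets only, a simplification of sampling at radius R.
    """
    center = q[:, :-radius]
    neighbor = q[:, radius:]
    counts = np.bincount(neighbor[center == sigma0], minlength=levels)
    return counts / counts.sum()

rng = np.random.default_rng(0)
# Synthetic stand-in for a natural image: a random walk along each row,
# so that nearby pixels are correlated.
image = rng.normal(size=(64, 64)).cumsum(axis=1)
q = equipopulate(image)
hist_black = conditional_histogram(q, sigma0=0, radius=2)    # darkest central pixel
hist_white = conditional_histogram(q, sigma0=15, radius=2)   # brightest central pixel
```

Comparing `hist_black` with `hist_white` at small `radius` mirrors the black-versus-white comparison described in panels B to D.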
Fig. 2.
Variability of local intensity histograms. (A) The three principal components of the ensemble of local intensity histograms PR(σ1|σ0) (blue, green, and red), along with the fraction of variance explained by each (bar chart inset). Together, the three components explain ∼90% of the variance between histograms. (B) An orthogonal transformation rotates the three principal components, {vj} → {wj}, so that w1 is as close as possible to a linear function of σ1, and w2 to a quadratic function of σ1 (open blue circles = linear function, θ1(σ1); open green circles = quadratic function, θ2(σ1)). The three new axes {wj} relate to variations in mean, variance, and blackshot of the intensity histogram. (C), (D), and (E) IID texture pairs. The intensity of each pixel is chosen independently according to the inset distributions which vary from the uniform distribution by adding or subtracting one principal component wj from B. These additions vary the mean (C), variance (D) or blackshot (E) of the texture. Only IID textures that vary in at least one of these three ways can be reliably discriminated by humans (15).
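The construction of the IID texture pairs in panels C to E can be illustrated as follows. This is a sketch under stated assumptions: the direction `w_linear` is a hypothetical stand-in for a mean-shifting component (a zero-mean linear function of σ1), not the component estimated in the paper, and the amplitude 0.1 is arbitrary.

```python
import numpy as np

LEVELS = 16

# Hypothetical stand-in for a mean-shifting component: linear in sigma1,
# zero mean, unit norm. The paper's components come from natural-image data.
sigma = np.arange(LEVELS)
w_linear = sigma - sigma.mean()
w_linear = w_linear / np.linalg.norm(w_linear)

def iid_texture(direction, amplitude, shape, rng):
    """Draw each pixel independently from uniform + amplitude * direction."""
    p = np.full(LEVELS, 1.0 / LEVELS) + amplitude * direction
    if np.any(p < 0):
        raise ValueError("amplitude too large: probabilities became negative")
    p = p / p.sum()  # guard against floating-point drift
    return rng.choice(LEVELS, size=shape, p=p)

rng = np.random.default_rng(0)
brighter = iid_texture(w_linear, +0.1, (64, 64), rng)  # component added
darker = iid_texture(w_linear, -0.1, (64, 64), rng)    # component subtracted
```

Replacing `w_linear` with a zero-mean quadratic function of σ1 would instead change the variance of the texture, as in panel D.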
Fig. 3.
Spatially correlated textures. (A) Correlations of different orders between four pixels. There are four mean pixel luminances (pink circles), six pairwise correlations (blue lines), four triplet correlations (green triangles), and one quadruplet (fourth-order) correlation (red square); translation invariance reduces the number of independent quantities to 10 (numbers in parentheses). (B) Examples of gliders and the textures they generate (see Materials and Methods). Both displayed textures have equally many white and black pixels, have no second- or third-order correlations, and have a large fourth-order correlation. Gliders from Group 1 generate textures that are perceptually salient against a white binary noise background, while textures generated from gliders in Group 2 are not perceptually salient (16, 24). (C) An example of a distribution over binary patterns in a square glider. This distribution generates synthetic textures that have only fourth-order correlations (example texture in Fig. 3B, left). (D) To measure the fourth-order correlations in natural scenes, we select patches of R × R pixels from whitened natural scenes binarized to have equally many white and black pixels. Each of the eight gliders in B (a square glider shown here in red) is scanned across a patch, and the histogram of binary patterns encountered by the glider is accumulated. (E) Histogram of binary patterns encountered by a square glider scanning a 64 × 64 patch from a natural image. (F) The information about texture in a 64 × 64 binary image patch that is contained in second-, third-, and fourth-order correlations, extracted with a square glider.
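The scanning procedure in panels D and E amounts to counting how often each of the 16 binary patterns appears under a four-pixel glider. The sketch below assumes a glider described by non-negative (row, column) offsets and uses a random binary patch in place of a whitened, binarized natural image; the function name `glider_histogram` is hypothetical.

```python
import numpy as np

def glider_histogram(patch, offsets):
    """Histogram of binary patterns seen by a glider scanned across `patch`.

    `offsets` lists the non-negative (row, col) positions of the glider's pixels.
    Pattern codes pack the pixel values as bits, first offset = lowest bit.
    """
    offsets = np.asarray(offsets)
    h = patch.shape[0] - offsets[:, 0].max()
    w = patch.shape[1] - offsets[:, 1].max()
    codes = np.zeros((h, w), dtype=np.int64)
    for bit, (dr, dc) in enumerate(offsets):
        codes += patch[dr:dr + h, dc:dc + w].astype(np.int64) << bit
    counts = np.bincount(codes.ravel(), minlength=2 ** len(offsets))
    return counts / counts.sum()

square = [(0, 0), (0, 1), (1, 0), (1, 1)]  # 2 x 2 square glider

rng = np.random.default_rng(0)
# Stand-in for a binarized 64 x 64 patch; whitening is not shown here.
patch = rng.integers(0, 2, size=(64, 64))
p = glider_histogram(patch, square)
```

For binary noise the 16 entries of `p` are close to uniform; structure in a natural-image patch would show up as deviations from that baseline.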
Fig. 4.
Fourth-order correlations and perceptual salience. (A) Decomposition of textural information into second (blue), third (green), and fourth (red) order for the two groups of gliders and many spatial scales (central line = mean, thin surrounding lines = standard deviation across gliders). In large image patches there is significantly more information about texture in the correlations between four pixels arranged in the patterns from Group 1 gliders, which also generate perceptually salient textures. Group 1 and Group 2 gliders have similar amounts of I(2,3). (B) Fourth-order correlations as measured by the parameter [formula image] (Eq. 1). Results at each R are averaged across Group 1 gliders (solid, circles) and Group 2 gliders (dashed, squares), and across many R × R texture patches. The shaded areas show the standard deviation of [formula image] across texture patches for the two groups. As R increases, the correlations within the perceptually salient gliders acquire high statistical significance. (C) The Jensen-Shannon distance, DJS, between the distributions of [formula image] sampled across many R × R image patches, for all pairs of gliders (arrangements of four pixels, see Fig. 3B). As R increases, the gliders cluster into two sets, which respectively generate the perceptually salient (Group 1) and nonsalient (Group 2) textures determined by psychophysical studies (17).
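For readers unfamiliar with the measure in panel C, here is a minimal sketch of a Jensen-Shannon distance between two discrete distributions. It assumes the common convention of taking the square root of the Jensen-Shannon divergence measured in bits; the abstract page does not state which convention the paper uses, and the example histograms are made up.

```python
import numpy as np

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (in bits) between p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability entries of a.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return np.sqrt(max(divergence, 0.0))  # clip tiny negative rounding errors

# Made-up histograms standing in for two gliders' sampled distributions.
d_same = js_distance([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
d_disjoint = js_distance([1.0, 0.0], [0.0, 1.0])
```

With this convention the distance is 0 for identical distributions and 1 for distributions with no overlap, so pairwise values can be compared directly when clustering gliders.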

References

    1. Barlow HB. In: Sensory Communication. Rosenblith W, editor. Cambridge, MA: MIT Press; 1961. pp. 217–234.
    2. Srinivasan MV, Laughlin SB, Dubs A. Predictive coding: a fresh view of inhibition in the retina. Proceedings of the Royal Society B (London) 1982;216:427–459.
    3. Atick JJ, Redlich AN. Towards a theory of early visual processing. Neural Comput. 1990;2:308–320.
    4. Balasubramanian V, Sterling P. Receptive fields and the functional architecture in the retina. J Physiol. 2009;587:2753–2767.
    5. Atick JJ, Li Z, Redlich AN. Understanding retinal color coding from first principles. Neural Comput. 1992;4:449–572.