J Vis. 2021 Aug 2;21(8):10. doi: 10.1167/jov.21.8.10.

Biased orientation representations can be explained by experience with nonuniform training set statistics


Margaret Henderson et al. J Vis.

Abstract

Visual acuity is better for vertical and horizontal compared to other orientations. This cross-species phenomenon is often explained by "efficient coding," whereby more neurons show sharper tuning for the orientations most common in natural vision. However, it is unclear if experience alone can account for such biases. Here, we measured orientation representations in a convolutional neural network, VGG-16, trained on modified versions of ImageNet (rotated by 0°, 22.5°, or 45° counterclockwise of upright). Discriminability for each model was highest near the orientations that were most common in the network's training set. Furthermore, there was an overrepresentation of narrowly tuned units selective for the most common orientations. These effects emerged in middle layers and increased with depth in the network, though this layer-wise pattern may depend on properties of the evaluation stimuli used. Biases emerged early in training, consistent with the possibility that nonuniform representations may play a functional role in the network's task performance. Together, our results suggest that biased orientation representations can emerge through experience with a nonuniform distribution of orientations, supporting the efficient coding hypothesis.


Figures

Figure 1.
Evaluating orientation discriminability in a trained neural network model. (A) Schematic of the VGG-16 network architecture, with layers arranged from shallowest (left) to deepest. (B) Examples of oriented images used to measure orientation representations in the pretrained network. Images were generated by filtering ImageNet images within a narrow orientation range, preserving their broadband spatial frequency content. Orientations varied between 0°–179°, in steps of 1° (see Methods, Evaluation stimuli section). (C) Cartoon depiction of the approximate relationship between an example single unit tuning function and the Fisher information (FI) measured from that unit as a function of orientation. (D) Hypothetical depiction of the relationship between the prior distribution P over orientation θ, and the Fisher information (FI) when mutual information is maximized (Ganguli & Simoncelli, 2010; Wei & Stocker, 2015).
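The relationship sketched in panel (C) can be made concrete: for a unit with deterministic tuning f(θ) and additive Gaussian noise of variance σ², the Fisher information is FI(θ) = f′(θ)²/σ². A minimal numerical sketch follows; the Von Mises tuning parameters and the constant-variance noise model are illustrative assumptions, not the paper's exact Methods:

```python
import numpy as np

def von_mises_tuning(theta_deg, center=90.0, k=2.0, amp=1.0, baseline=0.0):
    """Orientation tuning with 180-degree periodicity (Von Mises on the doubled angle)."""
    rad = np.deg2rad(2.0 * (theta_deg - center))
    return baseline + amp * np.exp(k * (np.cos(rad) - 1.0))

def fisher_information(theta_deg, sigma=0.1, d_theta=0.1):
    """FI(theta) = f'(theta)^2 / sigma^2, using a central finite-difference derivative."""
    f_plus = von_mises_tuning(theta_deg + d_theta)
    f_minus = von_mises_tuning(theta_deg - d_theta)
    deriv = (f_plus - f_minus) / (2.0 * d_theta)
    return deriv ** 2 / sigma ** 2

orients = np.arange(0, 180, 1.0)
fi = fisher_information(orients)
# FI is near zero at the tuning peak (zero slope) and largest on the flanks,
# which is the relationship cartooned in panel (C).
```

Note the characteristic shape: a unit contributes the most information at orientations where its tuning curve is steepest, not where it responds most strongly.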
Figure 2.
Pretrained VGG-16 shows maximum orientation information just off cardinal orientations, and nonuniformity in the distribution of single unit tuning properties. (A) FI is plotted as a function of orientation for several example layers of the pretrained model (navy blue) and a randomly initialized model (gray). See Methods, Computing Fisher information section for details. (B) Distribution of the tuning centers of pretrained network units that were well-fit by a Von Mises function. See Figure S1 for the proportion of well-fit units per layer, and the distribution of centers for the randomly initialized model. (C) Concentration parameter (k) versus center for individual units in the pretrained model (data in the top three panels of C have been downsampled to a maximum of 10,000 points for visualization purposes).
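A sketch of how a unit's tuning center and concentration parameter (k) could be recovered, assuming the 180°-periodic Von Mises form described in the caption; the synthetic unit, noise level, and tolerances here are illustrative, not the paper's fitting procedure:

```python
import numpy as np
from scipy.optimize import curve_fit

def von_mises(theta_deg, center, k, amp, baseline):
    """Von Mises tuning with 180-degree periodicity (doubled angle)."""
    rad = np.deg2rad(2.0 * (theta_deg - center))
    return baseline + amp * np.exp(k * (np.cos(rad) - 1.0))

# Synthetic responses from a hypothetical unit tuned to 45 degrees
rng = np.random.default_rng(0)
orients = np.arange(0, 180, 1.0)
responses = von_mises(orients, 45.0, 3.0, 1.0, 0.2)
responses = responses + rng.normal(0.0, 0.02, size=orients.size)

# Initialize the center at the peak response to avoid wrap-around issues
p0 = [float(np.argmax(responses)), 1.0, 1.0, 0.0]
params, _ = curve_fit(von_mises, orients, responses, p0=p0)
center, k, amp, baseline = params
```

Repeating such a fit per unit, and keeping only well-fit units, yields the distributions of centers and k values shown in panels (B) and (C).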
Figure 3.
Cardinal bias in a pretrained VGG-16 model increases with depth. FIB-0, a measure of cardinal information bias (see Methods, Fisher information bias section), plotted for a pretrained model (navy blue) and a randomly initialized control model (gray), with asterisks indicating layers for which the pretrained model had significantly higher FIB-0 than the random model (one-tailed nonparametric t-test, FDR corrected q = 0.01). Error bars reflect standard deviation across four evaluation image sets.
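The FIB measures are defined in the paper's Methods; one plausible form, a normalized contrast between FI near a set of peak orientations and FI elsewhere, can be sketched as follows. The window width and the choice of baseline are assumptions for illustration, not the paper's exact definition:

```python
import numpy as np

def fisher_information_bias(fi, peak_orients, window=5):
    """Normalized contrast (peak - base) / (peak + base) between mean FI
    within `window` degrees of the peak orientations and mean FI elsewhere.
    fi: length-180 array of FI over orientations 0..179 degrees."""
    orients = np.arange(180)
    near = np.zeros(180, dtype=bool)
    for p in peak_orients:
        # circular distance with a 180-degree period
        dist = np.minimum(np.abs(orients - p), 180 - np.abs(orients - p))
        near |= dist <= window
    peak = fi[near].mean()
    base = fi[~near].mean()
    return (peak - base) / (peak + base)
```

Under this form, FIB-0 would use peaks at the cardinals (0° and 90°), FIB-22 peaks at 22.5° counterclockwise of the cardinals, and FIB-45 peaks at 45° and 135°; a positive value indicates more information near those orientations than baseline.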
Figure 4.
Multivariate FI shows a similar pattern of results to the summed univariate FI. (A) Multivariate Fisher information, calculated after performing PCA, shown for four example layers in the pretrained model. See Methods, Multivariate analyses section for details. This figure reflects the calculation with the first 10 PCs retained; similar results were found with different numbers of PCs retained. (B) The multivariate version of FIB-0 is plotted for each layer, comparing results with five, 10, or 20 principal components retained. Error bars reflect +1 standard deviation of the measure across four evaluation image sets.
Figure 5.
Principal component analysis reveals a graded change in the structure of orientation representations across pretrained model layers. (A) For two example layers (conv1_1 and fc6), scatter plot of scores corresponding to the first two principal components of the layer's representation (see Methods, Multivariate analyses section). Colored points indicate individual images, with color indicating stimulus orientation; black points indicate the mean of the 48 points corresponding to each orientation. (B) Scores for the first four principal components are plotted as a function of orientation for several example layers. Blue lines indicate the mean value of each principal component score as a function of orientation; gray lines indicate individual images. (C) Percent variance explained by each principal component of the data, after averaging across trials of a common orientation. Vertical line indicates the number of components after which each additional component contributes <5% additional variance (see Methods, Multivariate analyses section).
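The PCA step described here can be sketched with a plain SVD over the unit-activation matrix. The shapes and random data below are illustrative stand-ins (the paper uses real layer activations and averages across trials of a common orientation before computing variance explained):

```python
import numpy as np

def pca_scores(activations, n_components=10):
    """activations: (n_images, n_units) matrix of layer responses.
    Returns per-image PC scores and variance-explained ratios via SVD."""
    centered = activations - activations.mean(axis=0)
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T          # project onto top PCs
    var_ratio = (S ** 2) / np.sum(S ** 2)            # fraction of variance per PC
    return scores, var_ratio[:n_components]

# Hypothetical layer: 180 orientations x 48 images each, 512 units
rng = np.random.default_rng(1)
acts = rng.normal(size=(180 * 48, 512))
scores, var_ratio = pca_scores(acts, n_components=10)
```

Plotting the first two columns of `scores` colored by stimulus orientation would reproduce the kind of scatter shown in panel (A).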
Figure 6.
Rotated images used to train VGG-16 networks. (A) Separate networks were trained on either upright or rotated versions of the ImageNet image set, with a smoothed circular mask applied to remove vertical and horizontal image edges. (B) Orientation content from images in each of the training sets in (A) was measured using a Gabor filter bank (see Methods, Measuring image set statistics section).
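A Gabor filter bank measurement of image orientation content, as described for panel (B), can be sketched as follows. The filter size, wavelength, orientation spacing, and energy summary used here are illustrative choices, not necessarily those in the paper's Methods:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, orient_deg, wavelength=8.0, sigma=3.0):
    """A single even-phase, zero-mean Gabor filter; the carrier varies
    along the axis rotated by orient_deg."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(orient_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / wavelength)
    g = envelope * carrier
    return g - g.mean()

def orientation_energy(image, orients=np.arange(0, 180, 22.5)):
    """Mean squared filter response per orientation in the bank."""
    energies = []
    for o in orients:
        resp = fftconvolve(image, gabor_kernel(21, o), mode='valid')
        energies.append(np.mean(resp ** 2))
    return np.array(energies)
```

Applying such a bank to each training image and averaging the energies yields an orientation histogram; for the rotated image sets, the histogram's peaks shift by the rotation angle.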
Figure 7.
When networks are trained on rotated images, both population-level information and single unit tuning distributions reflect modified training set statistics. (A–C) show data from one example layer (fc6) of four separately initialized networks trained on upright images, (D–F) show data for fc6 of networks trained on images rotated 22.5° counterclockwise of upright, and (G–I) show data for fc6 of networks trained on images rotated 45° counterclockwise of upright. For each group of networks, panels (A, D, G) show FI plotted as a function of orientation, with error bars reflecting standard deviation across four networks with the same training image set. Panels (B, E, H) show the distribution of fc6 unit tuning centers, combining data across networks. Panels (C, F, I) show concentration parameter (k) versus center for individual units.
Figure 8.
Networks show biases in orientation discriminability that are consistent with training set statistics. FIB-0, FIB-22, and FIB-45 represent the relative value of FI at cardinal orientations, 22.5° counterclockwise of cardinals, and 45° counterclockwise of cardinals, respectively, relative to a baseline (see Methods, Fisher information bias section). Panels show (A) FIB-0, (B) FIB-22, and (C) FIB-45 for models trained on each rotated version of ImageNet (colored), and randomly initialized models (gray). Colored asterisks indicate layers for which the models corresponding to that color had significantly higher FIB than the random models (one-tailed nonparametric t-test, FDR corrected q = 0.01). Error bars represent the standard deviation of the FIB over four initializations of each model and four evaluation image sets.
Figure 9.
Biases in Fisher information and unit tuning properties over the course of training on upright images. (A) Fisher information for three example layers, at several timepoints during training (shades of blue; legend in C). For comparison, analyses in Figures 2 and 3 were performed at step 400,000 (darkest blue line). (B) Concentration parameter (k) versus tuning center, for individual units at conv4_3, plotted for several timepoints during training. Data have been downsampled to a maximum of 20,000 points for visualization purposes. (C) FIB-0 across layers plotted for several timepoints (shades of blue). Error bars reflect +1 standard deviation of the measure across four evaluation image sets. (D) FIB-0 is plotted as a function of time, for several example layers (purple lines). Light gray line indicates model performance (top-5 recall accuracy), after smoothing with a Gaussian kernel. Error bars reflect +1 standard deviation of the measure across four evaluation image sets.
