Nonlinear mixed selectivity supports reliable neural computation

W Jeffrey Johnston et al. PLoS Comput Biol. 2020 Feb 18;16(2):e1007544.
doi: 10.1371/journal.pcbi.1007544

Abstract

Neuronal activity in the brain is variable, yet both perception and behavior are generally reliable. How does the brain achieve this? Here, we show that the conjunctive coding of multiple stimulus features, commonly known as nonlinear mixed selectivity, may be used by the brain to support reliable information transmission using unreliable neurons. Nonlinearly mixed feature representations have been observed throughout primary sensory, decision-making, and motor brain areas. In these areas, different features are almost always nonlinearly mixed to some degree, rather than represented separately or with only additive (linear) mixing, which we refer to as pure selectivity. Mixed selectivity has been previously shown to support flexible linear decoding for complex behavioral tasks. Here, we show that it has another important benefit: in many cases, it makes orders of magnitude fewer decoding errors than pure selectivity even when both forms of selectivity use the same number of spikes. This benefit holds for sensory, motor, and more abstract, cognitive representations. Further, we show experimental evidence that mixed selectivity exists in the brain even when it does not enable behaviorally useful linear decoding. This suggests that nonlinear mixed selectivity may be a general coding scheme exploited by the brain for reliable and efficient neural computation.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Mixed codes produce more discriminable stimulus representations.
A The noisy channel model. A stimulus x is encoded by an encoding function t_O(x) of order O. Next, the linear transform β is applied before independent Gaussian-distributed noise with variance σ² is added to the representation. Finally, a decoder produces an estimate x̂ of the original stimulus. B We analyze the encoding function with respect to three important code properties. The minimum distance Δ = d₁₂ is the smallest distance between any pair of encoded stimuli (codewords), and half of that distance is the distance to the nearest border of the Voronoi diagram (background shading). Thus, the minimum distance can be used to approximate the probability of decoding error. Representation energy P = r² is the square of the radius of the circle on which all of the codewords lie. All of the codewords lie in a 2-dimensional plane, so the code has population size D = 2. C Stimuli are described by K features C_i, each of which takes on |C_i| = n values. All possible combinations of feature values exist, so there are n^K unique stimuli. D In pure selectivity (left), units in the code, or neurons, respond to a particular value of one feature and are invariant to changes in other features. In nonlinear mixed selectivity (right), neurons respond to particular combinations of feature values, and the number of feature values in those combinations defines the order O of the code (here, O = 2). E The same O = 1 and O = 2 codes as in D. (top) The colored points are the response patterns in 3D response space for three of the four neurons in each code. The dashed grey line is the radius of the unit circle centered on the origin in each plane; the two codes are given constant representation energy, and all response patterns lie on the unit 4D hypersphere. For ease of visualization, the vertical dimension in the plot represents both the third and fourth neurons in the population, so that three representations from the O = 1 code can be shown; this does not change the minimum distance.
(bottom) The response patterns for the O = 2 mixed code have a greater minimum distance than those for the O = 1 pure code. F We derive closed-form expressions for each code metric; plots of each metric are shown for codes of order 1 to 6 with K = 6 and n = 10. G Mixed codes produce a higher minimum distance per unit representation energy (left) and have a smaller amount of representation energy per neuron than pure codes.
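The geometry in panels B–E can be sketched numerically. The snippet below is a minimal illustration, not the paper's exact construction: it assumes a pure (O = 1) code built from concatenated one-hot vectors, a fully mixed (O = K) code with one unit per feature conjunction, and unit representation energy for every codeword, then compares minimum distances.

```python
import numpy as np
from itertools import product

def pure_code(K, n):
    """O = 1 code: concatenate one one-hot vector per feature."""
    words = np.zeros((n ** K, K * n))
    for i, stim in enumerate(product(range(n), repeat=K)):
        for k, v in enumerate(stim):
            words[i, k * n + v] = 1.0
    return words

def mixed_code(K, n):
    """O = K code: one unit per full conjunction of feature values."""
    return np.eye(n ** K)

def min_distance(words):
    """Minimum distance: smallest gap between any two codewords."""
    d = np.linalg.norm(words[:, None, :] - words[None, :, :], axis=-1)
    d[np.diag_indices(len(words))] = np.inf
    return d.min()

K, n = 2, 3
pure = pure_code(K, n)
mixed = mixed_code(K, n)
# Give both codes the same representation energy P = 1 per codeword.
pure /= np.linalg.norm(pure, axis=1, keepdims=True)
mixed /= np.linalg.norm(mixed, axis=1, keepdims=True)
print(min_distance(pure), min_distance(mixed))  # ≈ 1.0 vs ≈ 1.414
```

At equal energy, the mixed code's minimum distance (√2) exceeds the pure code's (1), consistent with the claim that mixing yields more discriminable representations.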
Fig 2. Mixed codes make fewer errors than pure codes.
A (top) Simulation of codes with O = 1, 2, 3 for K = 3 and n = 5. (inset) For high SNR, code performance is well-approximated by our estimate of the error rate. (bottom) Same as above, except with K = 5 and n = 3. B (top) The estimated error rate at a fixed, high SNR (SNR = 9) for codes of every order, given a variety of different K (all with n = 5). Error probability decreases with code order for all codes except, in some cases, the O = K code. (bottom) The number of errors made by the pure code for every error made by the optimal mixed code at SNR = 9 (as above). In all cases, pure codes make several orders of magnitude more errors than the optimal mixed code. C (left) Given a pool of neurons of fixed size, the heat map shows the color corresponding to the code producing the highest minimum distance. The shaded area delineates the order of magnitude of the number of neurons believed to be contained in 1 mm³ of mouse cortex. (right) The same as on the left, but instead of a pool of neurons of fixed size, each code is given a fixed total amount of energy. The energy is allocated both to passive maintenance of a neural population (with size equal to the population size of the code) and to representation energy (increasing SNR). The shaded area is the same as on the left. The dashed lines plot our analytical solution for the transition point between the order-O and order-(O + 1) codes (see Total energy in Methods).
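The error-rate comparison in panel A can be reproduced qualitatively with a small Monte Carlo simulation. This is a sketch, not the paper's simulation code: it assumes a one-hot pure code and a one-unit-per-conjunction mixed code at unit representation energy, additive Gaussian noise, and nearest-codeword decoding.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

def codebooks(K, n):
    """Unit-energy pure (O = 1) and fully mixed (O = K) codebooks."""
    pure = np.zeros((n ** K, K * n))
    for i, stim in enumerate(product(range(n), repeat=K)):
        for k, v in enumerate(stim):
            pure[i, k * n + v] = 1.0
    pure /= np.sqrt(K)          # each codeword contains K ones
    return pure, np.eye(n ** K)

def error_rate(words, sigma, trials=20_000):
    """P(a nearest-codeword decoder picks the wrong stimulus)."""
    idx = rng.integers(len(words), size=trials)
    noisy = words[idx] + rng.normal(0.0, sigma, (trials, words.shape[1]))
    d = np.linalg.norm(noisy[:, None, :] - words[None, :, :], axis=-1)
    return np.mean(d.argmin(axis=1) != idx)

pure, mixed = codebooks(K=2, n=3)
sigma = 0.3                      # SNR = P / sigma^2 with P = 1
pe_pure = error_rate(pure, sigma)
pe_mixed = error_rate(mixed, sigma)
print(pe_pure, pe_mixed)         # mixed code makes fewer errors
```

Because the mixed code's larger minimum distance pushes codewords further apart at the same energy, its simulated error rate is substantially lower than the pure code's.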
Fig 3. Mixed codes can be more reliable than pure codes for both the probability of error (PE) and the mean squared error (MSE), but different RF sizes are appropriate for each.
A An illustration of our RF formalization. With K = 2 and n = 3, two example RFs of size σrf = 2 are shown. Simultaneous activity from both neurons uniquely specifies the center stimulus point. B Simulated PE of codes of all orders for K = 3 and n = 10 with σrf = 1, 2, 3 (legend as in C). Note that total energy is plotted on the x-axis, rather than SNR as in Fig 2. Mixed codes outperform the pure code over many (but not all) total energies. C The same as B, but for MSE rather than PE. Mixed codes perform worse than pure codes at low total energy, but perform better as total energy increases. D PE increases (top) and MSE decreases (bottom) as σrf increases for the codes in B and C, taken at the total energy denoted by the dashed grey line. E Cumulative distribution functions of the squared errors made by the codes in B and C at the grey dashed line. Increasing σrf decreases MSE despite increasing PE because the errors that do occur become smaller in magnitude, and this outweighs their becoming more numerous. The effect is largest for the O = K = 3 code.
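The PE/MSE trade-off in panels B–E can be illustrated for a single feature. The sketch below is a stand-in for the paper's RF formalization, assuming Gaussian-bump receptive fields of width σrf over n ordered feature values: widening the RF brings neighboring codewords closer, so more decoding errors occur (PE rises), but those errors land on nearby values, so MSE falls.

```python
import numpy as np

rng = np.random.default_rng(1)

def rf_code(n, sigma_rf):
    """Unit-energy codewords: one Gaussian RF bump per stimulus value."""
    v = np.arange(n)
    words = np.exp(-(v[None, :] - v[:, None]) ** 2 / (2 * sigma_rf ** 2))
    return words / np.linalg.norm(words, axis=1, keepdims=True)

def pe_mse(words, sigma, trials=50_000):
    """PE and MSE of a nearest-codeword decoder under Gaussian noise."""
    true = rng.integers(len(words), size=trials)
    noisy = words[true] + rng.normal(0.0, sigma, (trials, words.shape[1]))
    # Unit-norm codewords: nearest codeword = largest dot product.
    dec = (noisy @ words.T).argmax(axis=1)
    return np.mean(dec != true), np.mean((dec - true) ** 2)

narrow = rf_code(n=20, sigma_rf=0.3)   # nearly one-hot units
wide = rf_code(n=20, sigma_rf=2.0)     # broad, overlapping RFs
pe_n, mse_n = pe_mse(narrow, sigma=0.3)
pe_w, mse_w = pe_mse(wide, sigma=0.3)
print(pe_n, mse_n)   # few errors, but each is large
print(pe_w, mse_w)   # more errors, but each is small
```

With narrow RFs all codewords are roughly equidistant, so a confusion can land anywhere along the feature axis; with wide RFs confusions are almost always with adjacent values, which is exactly the PE-up, MSE-down pattern in panel D.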
Fig 4. Mixed codes support reliable decoding in the brain, not only flexible computation.
A The learned, arbitrary category boundary on motion direction used in the saccade DMC task. B A schematic of the saccade DMC task. C A heatmap of the z-scored magnitude of the coefficients for each term in the linear model, sorted by largest-magnitude term from left to right. The linear models were fit using the LASSO method, and terms were tested for significance using a permutation test (p < .05); only neurons with at least one significant term were included in this and the following plots (50/61 neurons). In addition, 10 of the 71 total recorded neurons were excluded for having fewer than 15 recorded trials for at least one condition. D (top) The average strength of significant tuning for each term across the neural population; O = 1 tuning is on the left, and O = 2 tuning is on the right. (bottom) The proportion of neurons in the population that have pure selectivity (left) for the two saccade targets and two categories of motion, and nonlinear mixed selectivity (right) for each of the four saccade target and category combinations. Error bars are bootstrapped 95% confidence intervals. E Single-feature decoding performance for a code chosen to mirror the conditions of the task, with K = 2 and n = 2. Mixing features together is advantageous even when decoding those features separately.
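The linear-model analysis in panel C can be sketched on synthetic data. The paper fits LASSO models and assesses terms with permutation tests; the toy version below instead uses ordinary least squares on a hypothetical neuron whose firing rate depends on the target × category conjunction, and recovers a large O = 2 (interaction) coefficient alongside small O = 1 (pure) ones.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical neuron: firing rate driven by the target x category
# conjunction (O = 2, nonlinear mixed selectivity), plus noise.
trials = 400
target = rng.integers(2, size=trials)     # two saccade targets
category = rng.integers(2, size=trials)   # two motion categories
rate = 5.0 + 3.0 * (target * category) + rng.normal(0.0, 1.0, trials)

# Design matrix: intercept, two O = 1 (pure) terms, and one O = 2
# (interaction) term, mirroring the structure of the paper's model.
X = np.column_stack([np.ones(trials), target, category, target * category])
coef, *_ = np.linalg.lstsq(X, rate, rcond=None)
print(coef)   # interaction coefficient recovered near 3
```

A neuron like this would show up in panel C with a dominant interaction term, the signature of nonlinear mixed selectivity in the fitted model.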
