Proc Natl Acad Sci U S A. 2017 Mar 21;114(12):E2494-E2503.
doi: 10.1073/pnas.1619949114. Epub 2017 Mar 8.

Correlated variability modifies working memory fidelity in primate prefrontal neuronal ensembles

Matthew L Leavitt et al. Proc Natl Acad Sci U S A.

Abstract

Neurons in the primate lateral prefrontal cortex (LPFC) encode working memory (WM) representations via sustained firing, a phenomenon hypothesized to arise from recurrent dynamics within ensembles of interconnected neurons. Here, we tested this hypothesis by using microelectrode arrays to examine spike count correlations (rsc) in LPFC neuronal ensembles during a spatial WM task. We found a pattern of pairwise rsc during WM maintenance indicative of stronger coupling between similarly tuned neurons and increased inhibition between dissimilarly tuned neurons. We then used a linear decoder to quantify the effects of the high-dimensional rsc structure on information coding in the neuronal ensembles. We found that the rsc structure could facilitate or impair coding, depending on the size of the ensemble and tuning properties of its constituent neurons. A simple optimization procedure demonstrated that near-maximum decoding performance could be achieved using a relatively small number of neurons. These WM-optimized subensembles were more signal correlation (rsignal)-diverse and anatomically dispersed than predicted by the statistics of the full recorded population of neurons, and they often contained neurons that were poorly WM-selective, yet enhanced coding fidelity by shaping the ensemble's rsc structure. We observed a pattern of rsc between LPFC neurons indicative of recurrent dynamics as a mechanism for WM-related activity and that the rsc structure can increase the fidelity of WM representations. Thus, WM coding in LPFC neuronal ensembles arises from a complex synergy between single neuron coding properties and multidimensional, ensemble-level phenomena.
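The two pairwise quantities named in the abstract, rsc and rsignal, can be illustrated with a minimal sketch. It assumes a common convention that the abstract itself does not spell out: rsignal is the Pearson correlation of two neurons' condition-averaged spike counts, and rsc is the Pearson correlation of their trial-by-trial counts after z-scoring within each condition. All function names here are hypothetical.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_by_condition(counts, conditions):
    """Map each condition label to the list of spike counts recorded under it."""
    groups = defaultdict(list)
    for count, cond in zip(counts, conditions):
        groups[cond].append(count)
    return groups


def signal_correlation(counts_a, counts_b, conditions):
    """r_signal: correlation of the two neurons' mean responses across conditions."""
    ga = group_by_condition(counts_a, conditions)
    gb = group_by_condition(counts_b, conditions)
    conds = sorted(ga)
    tuning_a = [sum(ga[c]) / len(ga[c]) for c in conds]
    tuning_b = [sum(gb[c]) / len(gb[c]) for c in conds]
    return pearson(tuning_a, tuning_b)


def zscore_within_condition(counts, conditions):
    """Remove the stimulus-driven mean and scale, condition by condition.

    Assumes every condition has at least two trials and nonzero variance.
    """
    stats = {}
    for cond, vals in group_by_condition(counts, conditions).items():
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))
        stats[cond] = (mean, sd)
    return [(c - stats[k][0]) / stats[k][1] for c, k in zip(counts, conditions)]


def spike_count_correlation(counts_a, counts_b, conditions):
    """r_sc: correlation of trial-to-trial fluctuations around each condition mean."""
    za = zscore_within_condition(counts_a, conditions)
    zb = zscore_within_condition(counts_b, conditions)
    return pearson(za, zb)
```

In this convention a pair can have opposite tuning (negative rsignal) while still sharing trial-to-trial fluctuations (positive rsc), which is why the two measures are reported separately.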

Keywords: decoding; macaque; noise correlations; prefrontal cortex; working memory.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Task, method, and single-cell data. (A) Overview of oculomotor delayed-response task. The arrow represents the correct saccade direction. The dashed circles indicate potential cue locations and are shown for illustrative purposes only and are not present in the task. (B) Array implantation sites and anatomical landmarks in both subjects. (C) Example delay-selective neuron. (D) Distribution of delay-selective units’ preferred locations. FIX, fixation; ROI, region of interest; STIM, stimulus.
Fig. 2.
Measures of correlated variability and its effects on WM information in full ensembles. (A) Mean pairwise rsc (y axis) across task epochs (x axis), controlling for firing rate (SI Materials and Methods). The mean is computed across all 2,000 subsampled distributions, and shaded regions are SEM calculated using the sample size of a single subsampled distribution (n = 10,535 pairs). *P < 0.001, bootstrap test. (B) Mean rsc for each task epoch (y axis) as a function of delay epoch rsignal (x axis). The same subsampling procedure as in A was applied, and then the rsc of each neuron pair was binned based on its corresponding rsignal, and the mean rsc computed in each bin. rsignal bins are size = 0.2, stepped by increments of 0.05. The shaded regions are SEM, calculated using the sample size of the corresponding rsignal bin. (C) Median rsc for similarly tuned neuron pairs (rsignal > 0.25) and dissimilarly tuned neuron pairs (rsignal < −0.25) in each task epoch. The colored region around each point represents the bootstrapped 99.9% confidence interval of the median, derived from 2,000 bootstrap iterations. Nonoverlapping colored regions indicate P < 0.001, bootstrap test; however, pairwise comparisons that are visually ambiguous have explicitly marked (*) significant differences. FIX, fixation; STIM, stimulus.
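The overlapping-bin average described for panel B (rsignal bins of width 0.2, stepped by 0.05) can be sketched as follows. This is an illustration only: the function name, the half-open bin edges, and the choice to skip bins with fewer than two pairs are assumptions, and the firing-rate-matched subsampling mentioned in the caption is not reproduced.

```python
import math


def sliding_bin_means(r_signal, r_sc, width=0.2, step=0.05, lo=-1.0, hi=1.0):
    """Mean r_sc and SEM in overlapping r_signal bins.

    Returns a list of (bin_center, mean, sem, n) tuples. Bins holding fewer
    than two pairs are skipped because the SEM is undefined for them.
    """
    n_bins = int(round((hi - lo - width) / step)) + 1
    results = []
    for i in range(n_bins):
        # Round to suppress floating-point drift in the bin edges.
        left = round(lo + i * step, 10)
        right = round(left + width, 10)
        is_last = i == n_bins - 1
        # Bins are half-open [left, right); the last bin also includes hi.
        vals = [sc for sig, sc in zip(r_signal, r_sc)
                if left <= sig < right or (is_last and sig == right)]
        n = len(vals)
        if n < 2:
            continue
        mean = sum(vals) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (n - 1))
        results.append((round(left + width / 2, 10), mean, sd / math.sqrt(n), n))
    return results
```

Because the step is smaller than the width, each pair contributes to several neighboring bins, which smooths the curve at the cost of making adjacent points statistically dependent.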
Fig. S1.
Decoding in full ensembles. (A) Decoding performance for full ensembles in each session. (B) Effects of removing the rsc structure (Δshuffle) during the delay epoch for each session. Removing the rsc structure by shuffling has a net effect of improving decoding accuracy in the full ensembles (i.e., ensembles including all simultaneously recorded neurons). The black line denotes the across-session mean. *P = 0.0032, paired t test. Δshuffle = [(accuracy_rsc-shuffled / accuracy_rsc-intact) − 1] × 100.
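A minimal sketch of the Δshuffle measure follows. The formula is taken from the caption; the shuffling routine is an assumption about how the rsc structure is removed (each neuron's trials are permuted independently within a condition, which keeps every neuron's tuning intact while breaking trial-by-trial pairing). Function names and the data layout are hypothetical.

```python
import random
from collections import defaultdict


def shuffle_out_rsc(counts, conditions, rng):
    """Permute each neuron's trials within condition, independently per neuron.

    counts[t][i] is the spike count of neuron i on trial t. The input is not
    modified; a shuffled copy is returned.
    """
    n_neurons = len(counts[0])
    trials_by_cond = defaultdict(list)
    for t, cond in enumerate(conditions):
        trials_by_cond[cond].append(t)

    shuffled = [list(row) for row in counts]
    for trial_idx in trials_by_cond.values():
        for i in range(n_neurons):
            source = trial_idx[:]
            rng.shuffle(source)  # a fresh permutation for every neuron
            for dst, src in zip(trial_idx, source):
                shuffled[dst][i] = counts[src][i]
    return shuffled


def delta_shuffle(acc_shuffled, acc_intact):
    """Percent change in decoding accuracy caused by removing the r_sc structure."""
    return (acc_shuffled / acc_intact - 1) * 100
```

For example, `delta_shuffle(0.66, 0.60)` is approximately 10, meaning that decoding improved by about 10% once the correlations were shuffled out; a negative value would mean the rsc structure was helping the decoder.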
Fig. 3.
Accounting for between-neuron phenomena increases ensemble efficiency. Visualization of the (A) best individual unit ensemble construction procedure and (B) optimized ensemble construction procedure. Each circle represents a unit, and the shading represents that unit’s information content, as assessed using the decoder. (C) Decoding results for the best individual unit (teal) and optimized procedures (violet), applied to a single example session. The continuous line plot with circular markers shows the ensemble decoding accuracy (y axis) as a function of size (x axis). The square markers at the bottom of the plot denote the decoding accuracy (y axis) of the individual unit added to the ensemble at a given size (x axis). Both methods yield identical results for ensembles of the maximum size because these ensembles are identical; they consist of every simultaneously recorded unit in the session (i.e., the full ensemble). (D) Coding efficiency of the optimized method relative to the best individual unit method (y axis) as a function of ensemble size (x axis). Coding efficiency is quantified as [(accuracy_optimized / accuracy_best-individual-unit) − 1] × 100. Colored lines are values for individual sessions. The thick black line is the across-session mean, and the gray shaded area is the SEM. The gray line running along the bottom indicates ensemble sizes for which the optimized method is significantly more efficient than the best individual unit method (P < 0.05, paired t test, Hochberg-corrected).
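The two construction procedures compared in this figure can be sketched as below. This is one plausible reading, not a confirmed account of the authors' procedure: the best individual unit method adds units in descending order of single-unit accuracy, and the optimized method is treated as greedy forward selection that, at each step, adds whichever unit most improves ensemble accuracy. The `score` callable stands in for a cross-validated decoder and is an assumption, as are all names.

```python
def best_individual_order(units, score):
    """Order units by their single-unit decoding accuracy, best first."""
    return sorted(units, key=lambda u: score([u]), reverse=True)


def greedy_order(units, score):
    """Greedy forward selection: repeatedly add the unit that maximizes
    the accuracy of the ensemble built so far."""
    remaining = list(units)
    ensemble = []
    while remaining:
        best = max(remaining, key=lambda u: score(ensemble + [u]))
        ensemble.append(best)
        remaining.remove(best)
    return ensemble


def coding_efficiency(acc_optimized, acc_best_individual):
    """Percent gain of the optimized ensemble over the best-individual-unit
    ensemble of the same size (formula from the caption)."""
    return (acc_optimized / acc_best_individual - 1) * 100


# Toy accuracies (made up for illustration): units "a" and "b" are individually
# informative but redundant, whereas "c" is weak alone yet complements "a".
TOY_ACCURACY = {
    frozenset("a"): 0.50, frozenset("b"): 0.45, frozenset("c"): 0.30,
    frozenset("ab"): 0.52, frozenset("ac"): 0.70, frozenset("bc"): 0.60,
    frozenset("abc"): 0.75,
}


def toy_score(ensemble):
    """Look up the toy accuracy for a list of unit labels."""
    return TOY_ACCURACY[frozenset(ensemble)]
```

With the toy table, `best_individual_order("abc", toy_score)` gives `a, b, c` while `greedy_order("abc", toy_score)` gives `a, c, b`: both start from the same unit and end at the same full ensemble, but the greedy path reaches higher accuracy at size 2.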
Fig. S2.
Decoding saturation curves for the best individual unit vs. best subensemble methods. Normalized decoding accuracy (y axis) as a function of normalized ensemble size (x axis) is plotted for the best individual unit (teal), optimized (violet), and random (gray) ensembles. The normalized ensemble size at which normalized decoding accuracy of 0.95 is achieved is shown for the three procedures.
Fig. 4.
Effects of rsc structure on ensemble coding efficiency and composition. (A) Decoding accuracy (y axis) as a function of ensemble size (x axis) for the best individual unit (teal), rsignal + rsc (violet), and rsignal-only (blue) methods for the same example session as in Fig. 3C. Note that, for the rsignal-only ensembles, the classifier was trained and tested on rsc-shuffled data whereas, for the rsignal + rsc and best individual unit ensembles, the classifier was trained and tested on rsc-intact data. (B) Coding efficiency of rsignal + rsc ensembles and rsignal-only ensembles, relative to the best individual unit ensembles (y axis), as a function of ensemble size (x axis). The violet line running along the bottom indicates ensemble sizes for which the rsignal + rsc ensembles are significantly more efficient than the best individual unit ensembles (P < 0.05, paired t test, Hochberg-corrected); the blue line is similar, but for rsignal-only ensembles vs. best individual unit ensembles. Note that the coding efficiency of rsignal + rsc ensembles relative to best individual unit ensembles was previously shown in Fig. 3D. (C) Coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles; similar to Fig. 3D. A positive value indicates that shuffling out the rsc structure improves decoding. The striped blue and violet lines running along the bottom indicate ensemble sizes for which the efficiency of rsignal + rsc ensembles and rsignal-only ensembles is significantly different (P < 0.05, paired t test, Hochberg-corrected). (D) Decoding performance of rsc-shuffled vs. rsc-intact ensembles (Δshuffle, y axis) as a function of ensemble size (x axis) for random ensembles. Ensembles were generated by randomly subsampling n units from the full recorded population in a given session. The gray lines running along the bottom indicate ensemble sizes for which the rsc-shuffled vs. rsc-intact ensembles are significantly different (P < 0.05, paired t test, Hochberg-corrected). (E) Similarity between rsignal + rsc ensembles and rsignal-only ensembles (y axis) as a function of ensemble size (x axis). Ensemble similarity is quantified as the proportion of units common to the two ensembles for a given size. Note that ensemble similarity is 1 for ensembles of size n = 1, and for the largest ensemble size in a given session, because both ensemble-building procedures begin with the same unit, and the largest ensemble in each session consists of every simultaneously recorded unit in that session. The gray line running along the bottom indicates ensemble sizes for which the similarity of the rsignal + rsc ensembles and rsignal-only ensembles is significantly less than 1 (P < 0.05, z-test of proportion, Hochberg-corrected).
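The ensemble similarity measure in panel E (the proportion of units common to two ensembles of a given size) can be written directly. The sketch assumes each ensemble of size n is the first n units of a construction order; the function name and the list-of-labels representation are hypothetical.

```python
def ensemble_similarity(order_a, order_b, size):
    """Proportion of units shared by the first `size` units of two
    ensemble construction orders."""
    if not 1 <= size <= min(len(order_a), len(order_b)):
        raise ValueError("size must be between 1 and the shorter order's length")
    shared = set(order_a[:size]) & set(order_b[:size])
    return len(shared) / size
```

When both orders start with the same unit and cover the same set of units, the value is 1 at size 1 and at the full size, matching the boundary behavior the caption notes.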
Fig. S3.
Effects of rsc structure on ensemble coding efficiency when using logistic regression. (A) Similar to Fig. 4B, but using logistic regression instead of SVM. Coding efficiency of rsignal + rsc ensembles and rsignal-only ensembles, relative to the best individual unit ensembles (y axis), as a function of ensemble size (x axis). The violet line running along the bottom indicates ensemble sizes for which the rsignal + rsc ensembles are significantly more efficient than the best individual unit ensembles (P < 0.05, paired t test, Hochberg-corrected); the blue line is similar, but for rsignal-only ensembles vs. best individual unit ensembles. (B) Coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles; similar to Fig. 4C, but using logistic regression instead of SVM. A positive value indicates that shuffling out the rsc structure improves decoding. The striped blue and violet line running along the bottom indicates ensemble sizes for which the efficiency of rsignal + rsc ensembles and rsignal-only ensembles is significantly different (P < 0.05, paired t test, Hochberg-corrected).
Fig. S4.
Effects of the rsc structure during the stimulus vs. delay epochs. (A) Similar to Fig. 4C, but during the stimulus epoch. Coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles for individual sessions. A positive value indicates that shuffling out the rsc structure improves decoding. Each colored line is an individual session. (B) Across-session mean coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles during the stimulus epoch. The gray lines running along the bottom indicate ensemble sizes for which the efficiency of rsignal + rsc ensembles and rsignal-only ensembles is significantly different (P < 0.05, paired t test, Hochberg-corrected). (C) Identical data as Fig. 4C, presented again here for comparison with stimulus epoch data. Coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles for individual sessions. (D) Identical data as Fig. 4C, presented again here for comparison with stimulus epoch data. Across-session mean coding efficiency of rsignal-only ensembles relative to rsignal + rsc ensembles.
Fig. S5.
Δshuffle and coding efficiency in random vs. optimized ensembles. (A) Δshuffle and ensemble size are correlated in random ensembles. The Spearman rank correlation coefficient (ρ) between ensemble size and Δshuffle in random ensembles, for each of the 12 recording sessions. A positive correlation indicates that shuffling out the rsc structure increases decoding accuracy more in larger ensembles. *P < 0.01, Spearman correlation. (B) rsignal + rsc ensembles are more efficient than random ensembles at nearly all ensemble sizes. Coding efficiency of the rsignal + rsc ensembles relative to random ensembles (y axis) as a function of ensemble size (x axis). A positive value indicates that the rsignal + rsc ensembles are more efficient for a given ensemble size. The gray line running along the bottom indicates ensemble sizes for which the rsignal + rsc ensembles are significantly more efficient than the random ensembles (P < 0.05, paired t test, Hochberg-corrected).
Fig. S6.
Ensemble construction sequence correlation shows that different ensembles maximize WM information when the rsc structure is intact vs. removed. (A) Sequence order correlation between rsignal + rsc ensembles (y axis) and rsignal-only ensembles (x axis) for an example session. The nth unit added to the ensemble in the rsignal-only method (x axis) is plotted against the n at which it was added to the ensemble in the rsignal + rsc method. For example, the fifth unit added in the rsignal-only method is added fourth in the rsignal + rsc method. If the sequence in which the two methods added units was identical, all of the points would fall along the unity line (gray), and the Spearman rank correlation coefficient (ρ) between the two sequences would equal 1, indicating that identical ensembles maximize WM information regardless of whether the rsc structure is intact. Conversely, ρ = 0 means that the relationship between the sequences is random; the ensembles that maximize WM information when the rsc structure is intact are entirely distinct from the ensembles that maximize WM information when the rsc structure has been removed. (B) Spearman rank correlation coefficient (ρ) between the rsignal + rsc ensembles and rsignal-only ensembles (y axis) for each session (x axis). Error bars are Bonferroni-corrected 95% confidence intervals. The correlation between ensemble sequences is significantly less than 1 in every session (P < 0.05, Bonferroni-corrected), indicating that the ensembles that maximize WM information when the rsc structure is intact are built in a different sequence than the ensembles that maximize WM information when the rsc structure has been removed. The range of ρ values (0.53 to 0.86) shows that there is some degree of similarity to the sequences in which the two methods recruit neurons to the ensembles.
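The sequence correlation described in this caption can be illustrated with a short sketch. Because each construction order is a permutation of the same units, there are no tied ranks, so the closed-form Spearman formula ρ = 1 − 6Σd²/(n(n² − 1)) applies, where d is the difference between a unit's positions in the two orders. The function name and unit labels are hypothetical.

```python
def sequence_spearman(order_a, order_b):
    """Spearman rank correlation between two construction orders of the
    same set of units (no ties, so the closed-form formula is exact)."""
    n = len(order_a)
    if n < 2 or len(order_b) != n:
        raise ValueError("orders must have equal length of at least 2")
    if len(set(order_a)) != n or set(order_a) != set(order_b):
        raise ValueError("orders must be permutations of the same units")
    position_b = {unit: k for k, unit in enumerate(order_b)}
    d_squared = sum((k - position_b[unit]) ** 2
                    for k, unit in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Swapping only the last two of five units, for instance, gives ρ = 0.9: the orders differ, but most of the sequence is preserved.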
Fig. 5.
Ensembles optimized for WM representation are rsignal-diverse and anatomically dispersed. (A) rsignal distributions for the full ensembles (gray; n = 12,222 units), near-max rsignal + rsc ensembles (violet; n = 2,414), and near-max rsignal-only ensembles (blue; n = 2,724), pooled across all sessions. All three distributions are significantly different from each other (P << 0.001, χ2 test, Bonferroni-corrected; computed using nonoverlapping bins of size = 0.1). (B) Mean |rsignal deviation| in the full (gray), near-max rsignal + rsc (violet), and near-max rsignal-only ensembles (blue). rsignal deviation is defined as the difference between a unit pair’s rsignal and the mean rsignal of the ensemble to which the unit pair belongs. **P << 0.001, Bonferroni-corrected, *P = 0.01, F test (SI Materials and Methods). Shaded regions represent Bonferroni-corrected 95% comparison intervals between group means (SI Materials and Methods). (C) Mean interunit distance in each of the three ensemble groups. *P < 0.005, F test, Bonferroni-corrected. Shaded regions represent Bonferroni-corrected 95% comparison intervals between group means. (D) Correlation between interunit distance and rsignal in the three ensemble groups. *P < 0.005, bootstrap test. Shaded regions represent bootstrapped 95% confidence intervals. (E) Mean interunit distance (y axis) as a function of rsignal in each of the ensemble groups, computed using nonoverlapping rsignal bins of size 0.1. Shaded region denotes SEM.
Fig. S7.
Functional anatomy analyses are consistent even when using different decoding saturation thresholds. (A) rsignal distributions for the full ensembles (gray), near-max rsignal + rsc ensembles (violet), and near-max rsignal-only ensembles (blue), pooled across all sessions. Identical to Fig. 5A, but using a threshold of 90% of maximum decoding. The rsignal + rsc ensemble and rsignal-only ensemble distributions are both significantly different from the full ensemble distributions (P < 0.001, χ2 test, Bonferroni-corrected; computed using nonoverlapping bins of size = 0.1). (B) Identical to A, but using a threshold of 80% of maximum decoding. All three distributions are significantly different from each other (P < 0.001, χ2 test, Bonferroni-corrected; computed using nonoverlapping bins of size = 0.1). (C) Mean |rsignal deviation| in the three categories of ensembles. Identical to Fig. 5B, but using a threshold of 90% of maximum decoding. rsignal deviation is defined as the difference between a unit pair’s rsignal and the mean rsignal of the ensemble to which the unit pair belongs. Shaded regions represent Bonferroni-corrected 95% comparison intervals between group means (Materials and Methods). (D) Identical to C, but using a threshold of 80% of maximum decoding. (E) Mean interunit distance in each of the three ensemble groups. Identical to Fig. 5C, but using a threshold of 90% of maximum decoding. Shaded regions represent Bonferroni-corrected 95% comparison intervals between group means. (F) Identical to E, but using a threshold of 80% of maximum decoding. (G) Correlation between interunit distance and rsignal in the three ensemble groups. Identical to Fig. 5D, but using a threshold of 90% of maximum decoding. Shaded regions represent bootstrapped 95% confidence intervals. (H) Identical to G, but using a threshold of 80% of maximum decoding. *P = 0.01, **P << 0.001, F test (Materials and Methods).
Fig. 6.
Nonselective neurons can increase ensemble information by modifying the rsc structure. (A) Two-neuron conceptual diagram of how a nonselective neuron could increase ensemble information content. In the first scenario (Left), one neuron differentiates between two stimuli (i.e., is selective; stimuli are denoted by blue and pink), and the other neuron does not (i.e., is not selective). The response variability of the two neurons is not correlated (i.e., rsc = 0). In the second scenario (Right), the individual neurons’ properties are identical, yet correlated response variability (i.e., the rsc structure) improves discrimination between the two stimuli relative to the uncorrelated scenario. (B) The continuous line plots with circular markers show the ensemble decoding accuracy (y axis) as a function of size (x axis) for the rsc + rsignal-optimized method for a single example ensemble, before decoding saturation, for rsc-intact data (magenta) and rsc-shuffled data (pale magenta). The square markers at the bottom of the plot denote the decoding accuracy (y axis) of the individual unit added to the ensemble at a given size (x axis). Note the nonselective units (gray) that are added to the ensemble. (C) Change in decoding accuracy from adding nonselective units to presaturation ensembles (y axis) when the rsc structure is intact (left) and removed (right). Each line is the change for an individual unit. The bolded line is the median. Removing the rsc structure eliminates the information gain contributed by these units. *P = 0.001, signed-rank test; **P < 0.003, paired signed-rank test; ns (not significant), P = 0.43, signed-rank test; n = 16.
Fig. S8.
Descriptive statistics and control analyses for nonselective noise-shaping units. (A) Histogram of delay epoch firing rates of nonselective (P ≥ 0.05, Kruskal–Wallis ANOVA; blue) and selective (P < 0.05, Kruskal–Wallis ANOVA; orange) units in best ensembles. Firing rate bins are plotted on the x axis, and bin counts are plotted on the y axis. (B) Histogram of delay epoch decoding accuracies of nonselective and selective units in best ensembles. Decoding accuracy bins are plotted on the x axis, and bin counts are plotted on the y axis. (C) Parametric selectivity control. We reran the analysis in Fig. 6, defining selectivity as P < 0.05, ANOVA, instead of Kruskal–Wallis ANOVA. The change in decoding accuracy from adding nonselective units to best ensembles (y axis) is compared when the rsc structure is intact (left) and removed (right). Each line is the change for an individual unit. The bolded line is the median. Similar to when using a nonparametric measure of WM selectivity, removing the rsc structure eliminates the information gain contributed by these units. *P = 0.001, signed-rank test; **P = 0.01, paired signed-rank test; ns, P = 0.58, signed-rank test; n = 13. (D) Selective unit control. We used a distribution-matching procedure (Materials and Methods) to obtain a distribution of WM-selective units (P < 0.05, Kruskal–Wallis ANOVA) that contribute the same amount of information to an ensemble as the nonselective units and performed an analysis similar to that in Fig. 6. The center of each box is the median, and the notches span the median ± 1.57(q3 − q1)/√n, where q3 is the 75th percentile, q1 is the 25th percentile, and n = 13, the sample size of a single matched distribution. The bottom box edge = q1, top edge = q3, and the whiskers extend to cover ∼99.3% of the distribution. Removing the rsc structure does not change the magnitude of information added to the ensemble by these units (P > 0.05, bootstrap test) (Materials and Methods); thus, it is not simply the case that units that weakly improve ensemble information do so by modifying the rsc structure. ns, P > 0.05, bootstrap test. (E) Selective unit plus parametric selectivity control. Same as D, but defining selectivity as P < 0.05, ANOVA, instead of Kruskal–Wallis ANOVA. The results are similar for the two methods. ns (not significant), P > 0.05, bootstrap test.
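The notch formula quoted for the box plots in panel D, median ± 1.57(q3 − q1)/√n, can be computed as in the sketch below. The percentile helper uses linear interpolation between order statistics, which is an assumption; the caption does not state which percentile definition was used, and other definitions give slightly different quartiles. Function names are hypothetical.

```python
import math


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 1]) of pre-sorted data."""
    pos = p * (len(sorted_vals) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac


def notch_interval(values):
    """Return (low, high) notch limits: median +/- 1.57 * IQR / sqrt(n)."""
    vals = sorted(values)
    n = len(vals)
    median = percentile(vals, 0.5)
    q1 = percentile(vals, 0.25)
    q3 = percentile(vals, 0.75)
    half_width = 1.57 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median + half_width
```

Non-overlapping notches between two boxes are conventionally read as rough evidence that the medians differ, which is why the caption reports the notch definition alongside the bootstrap test.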
Fig. S9.
Comparison of untuned noise-shaping neurons during stimulus and delay epochs. (A) Similar to Fig. 6C, but during the stimulus epoch. Change in decoding accuracy from adding nonselective units to presaturation ensembles (y axis) when the rsc structure is intact (left) and removed (right). Each line is the change for an individual unit. The bolded line is the median. #P = 0.013, signed-rank test; **P < 0.003, paired signed-rank test; ns, P > 0.05, signed-rank test; n = 15. (B) Contribution of untuned, noise-shaping units during the delay epoch. Identical to Fig. 6C, presented again here for comparison with stimulus epoch data. *P = 0.001, signed-rank test, **P < 0.003, paired signed-rank test; ns (not significant), P > 0.05, signed-rank test; n = 16.
