Nature, 539 (7628), 289-293

A Basal Ganglia Circuit for Evaluating Action Outcomes

Marcus Stephenson-Jones et al. Nature.

Abstract

The basal ganglia, a group of subcortical nuclei, play a crucial role in decision-making by selecting actions and evaluating their outcomes. While much is known about the function of the basal ganglia circuitry in selection, how these nuclei contribute to outcome evaluation is less clear. Here we show that neurons in the habenula-projecting globus pallidus (GPh) in mice are essential for evaluating action outcomes and are regulated by a specific set of inputs from the basal ganglia. We find in a classical conditioning task that individual mouse GPh neurons bidirectionally encode whether an outcome is better or worse than expected. Mimicking these evaluation signals with optogenetic inhibition or excitation is sufficient to reinforce or discourage actions in a decision-making task. Moreover, cell-type-specific synaptic manipulations reveal that the inhibitory and excitatory inputs to the GPh are necessary for mice to appropriately evaluate positive and negative feedback, respectively. Finally, using rabies-virus-assisted monosynaptic tracing, we show that the GPh is embedded in a basal ganglia circuit wherein it receives inhibitory input from both striosomal and matrix compartments of the striatum, and excitatory input from the 'limbic' regions of the subthalamic nucleus. Our results provide evidence that information about the selection and evaluation of actions is channelled through distinct sets of basal ganglia circuits, with the GPh representing a key locus in which information of opposing valence is integrated to determine whether action outcomes are better or worse than expected.

Conflict of interest statement

The authors declare no competing financial interests. Readers are welcome to comment on the online version of the paper.

Figures

Extended Data Figure 1. Vglut2 and Somatostatin are markers for GPh neurons
a, Image showing the projection patterns of nonspecifically labelled neurons (green, infected with adeno-associated virus (AAV) expressing GCaMP6 (AAV1-Syn-GCAMP6f.WPRE.SV40); signal was enhanced by anti-GFP antibody; see Methods) and Vglut2+ neurons (red, infected with AAV expressing mCherry in a Cre-dependent manner (AAV8-hSyn-DIO-mCherry); signal was enhanced by anti-mCherry antibody; see Methods) in the EP of a Vglut2-cre mouse. b, Confocal images of the LHb, ventrolateral thalamus (VL) and ventromedial thalamus (VM), showing fibers originating from the nonspecifically labelled neurons (green) and Vglut2+ neurons (red) in the EP. c, Quantification of the GFP and mCherry fluorescence intensity in the projection targets of the EP neurons. d, Upper panel: representative image showing retrograde labelling of GPh neurons by injection of the cholera toxin subunit B conjugated to Alexa Fluor 594 (CTB-594) into the LHb (inset) of Vglut2-cre;Rosa26-stopflox-H2b-GFP mice, in which Vglut2+ cells can be identified based on their expression of nuclear GFP. Lower panels: high magnification pictures of the boxed area in the EP in the upper panel, showing the co-labelling of GPh neurons by CTB-594 and Vglut2 (arrowheads). The vast majority of CTB-labelled neurons expressed Vglut2 (95.45 ± 1.2% (mean ± s.e.m.), n = 6 mice). e, Upper panel: a representative image showing retrograde labelling of VM-projecting EP neurons by injection of CTB-594 into the VM (inset) of Vglut2-cre;Rosa26-stopflox-H2b-GFP mice. Lower panels: high magnification pictures of the boxed area in the EP in the upper panel, showing the segregation of the EP neurons labelled by CTB-594 and those labelled by Vglut2 (arrowheads). Very few CTB-labelled neurons expressed Vglut2 (0.51 ± 0.45%, n = 6 mice). f, Upper panel: a representative image showing retrograde labelling of VM-projecting EP neurons by injection of CTB-594 into the VM (inset). 
Lower panels: high magnification pictures of the boxed area in the EP in the upper panel, showing the segregation of the EP neurons labelled by CTB-594 and those labelled by anti-Som antibody (arrowheads). Very few CTB-labelled cells expressed Som (0.88 ± 0.72%, n = 5 mice). g, Upper panel: a representative image showing antibody labelling of Som in the EP of Vglut2-Cre;Rosa26-stopflox-H2b-GFP mice. Lower panels: high magnification pictures of the boxed area in the upper panel, showing the co-labelling of EP neurons by Som and Vglut2 (arrowheads). The vast majority of Vglut2 neurons expressed Som (90.87 ± 0.79%, n = 6 mice). h, A cartoon showing the only projection target of GPh neurons (red) and the different projection targets of classic GPi neurons (blue). Diagram in h was modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 2. Classification of EP neurons on the basis of their distinct response profiles
a, Schematic of the experimental approach used for in vivo recording and optogenetic tagging. b, Photomicrograph showing a DiI labelled recording site. c, Schematics showing the locations of the recording sites (n = 15 mice). d, Responses of three example neurons in the classical conditioning task. e, Left: auROC plots of the responses of all neurons during large reward trials. Red, increase from baseline; blue, decrease from baseline; each row represents one neuron. Green bars indicate the neurons that were “optogenetically tagged” (n = 11 neurons). The three main clusters are arranged in order to match the neurons presented in d. Right: first three principal components and hierarchical clustering dendrogram showing the relationship of each neuron within the three clusters. f, Average firing rates of the three types of neurons (n = 86 neurons from 9 mice). g, Plots of peristimulus time histogram (PSTH) showing inhibition for type I (top, n = 7 neurons from 4 mice), but no change for type II (middle, n = 9 neurons from 4 mice) or type III (bottom, n = 10 neurons from 4 mice) neurons in response to green light pulses (green bars, 200 ms; 100 trials per neuron, 0.3 Hz). Only type II and type III neurons that were recorded in the same sessions and animals as those of the light-responsive type I neurons represented in g are shown. h, auROC plots of the responses of all 38 neurons (n = 9 mice) recorded during large punishment trials. i & j, Average firing rates of type II (n = 11 neurons from 9 mice) (i) and type III (n = 11 neurons from 9 mice) (j) neurons during punishment trials. Diagrams in a and c were modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 3. Response profiles of putative GPh neurons during different CS-US contingencies
a, Graphs showing hierarchical clustering used to identify additional putative GPh neurons used in the analysis for this figure. All data shown (b–i) are from type I neurons only. Left: auROC plots of the responses of all additional neurons recorded. Red, increase from baseline; blue, decrease from baseline. Each row represents one neuron. Green bars indicate the neurons that were optogenetically tagged. Right: first three principal components and hierarchical clustering dendrogram showing the relationship of each neuron within the three clusters. b, auROC plots showing the firing rate changes in response to CS (top) and reward (bottom) prior to behavioural training. c, auROC plots showing the firing rate changes in response to CS (top) and airpuff (bottom) prior to behavioural training. d, auROC plots showing the firing rate changes in response to an expected (top) or unexpected (bottom) reward. e, auROC plots showing the firing rate changes in response to an expected (top) or unexpected (bottom) airpuff. f, auROC plots showing the firing rate changes in response to receiving an expected airpuff (left) or having an expected airpuff omitted (right). g, Histogram of difference in firing rate between airpuff omission and airpuff (filled bars, P < 0.05, t test). Values are represented using auROC. h, auROC plots showing the firing rate changes in response to receiving an expected reward (left) or having an expected reward omitted (right). i, Histogram of difference in firing rate between reward omission and reward (filled bars, P < 0.05, t test). Values are represented using auROC.
Extended Data Figure 4. Optic fiber implantation locations
a, A schematic of the experimental approach used for Arch-mediated inhibition of GPh neurons. b, A photomicrograph showing the location of optic fibre placement and ArchT-GFP+ GPh neurons within the EP. c, Schematics showing the location of the optic fibre placements (n = 5). d, A schematic of the experimental approach used for Arch-mediated inhibition of the GPh-LHb projection. e, A photomicrograph showing the location of optic fibre placement and ArchT-GFP+ axon fibers within the LHb. f, Schematics showing the location of the optic fibre placements (n = 7). g, A schematic of the experimental approach used for Arch-mediated inhibition of the GPh, which was targeted retrogradely by injection of the LHb with CAV2-Cre. h, A photomicrograph showing the location of the optic fibre placement and ArchT-GFP+ neurons in the EP. i, Schematics showing the location of the optic fibre placements (n = 5). j, A schematic of the experimental approach used for ChR2-mediated excitation of the GPh, which was targeted retrogradely by injection of the LHb with CAV2-Cre. k, A photomicrograph showing the location of the optic fibre placement and ChR2-GFP+ neurons in the EP. l, Schematics showing the location of the optic fibre placements (n = 5). m, Schematic of the experimental approach used for ChR2-mediated activation of the GPh-LHb projection. n, A photomicrograph showing the optic fibre placement and ChR2-YFP+ axon fibres in the LHb. o, Schematics showing the location of the optic fibre placements (n = 6). Diagrams in a, c, d, f, g, i, j, l, m and o were modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 5. Optogenetic inhibition of the GPh drives reward-related behaviours
a, Confocal images from a Som-cre;Ai14 mouse, showing the overlap in expression of ChR2-YFP and tdTomato (indicating Som+ neurons) in GPh neurons. b, Quantification of the percentage of ChR2-YFP+ neurons that expressed tdTomato (n = 2). c, Confocal images from a Som-cre;Ai14 mouse, showing the overlap in expression of ArchT-YFP and tdTomato in GPh neurons. d, Quantification of the percentage of ArchT-YFP+ neurons that expressed tdTomato (n = 2). e, Schematic of the experimental approach used for ArchT-mediated inhibition of GPh neurons. f, Heatmaps for the activity of a representative mouse at baseline (top), or during optogenetic inhibition of the GPh in either the left (middle) or right (bottom) chamber. g, GPhArch mice (n = 5), but not GPheYFP mice (n = 5), showed a significant place preference for the chamber paired with laser stimulation in the GPh (F(5,29) = 14.95, P < 0.0001, ***P < 0.001, **P < 0.01, two-way ANOVA followed by Tukey’s test). h, Schematic of the experimental approach used for ArchT-mediated inhibition of GPh axon terminals in the LHb. i, GPhArchT mice (n = 7), but not GPheYFP mice (n = 5), showed a significant place preference for the chamber paired with laser stimulation in the GPh (F(5,35) = 52.22, P < 0.0001, ***P < 0.001, **P < 0.01, two way ANOVA followed by Tukey’s test). j, GPhArch mice (n = 5) made significantly more nose pokes than GPheYFP mice (n = 5) to obtain laser stimulation in the GPh (T(8) = 2.61, *P < 0.05, t test). k, Schematic of the retrograde labelling approach used to target the GPh for ArchT-mediated optical inhibition (top). GPhCAV-Cre/Arch mice (n = 5), but not GPheYFP mice (n = 5), showed a significant place preference for the chamber paired with laser stimulation in the GPh (bottom) (F(5,29) = 5.98, P < 0.01, *P < 0.05, two way ANOVA followed by Tukey’s test). l, Schematic of the retrograde labelling approach used to target the GPh for ChR2-mediated optical excitation (top). 
GPhCAV-Cre/ChR2 mice (n = 5), but not GPheYFP mice (n = 5), showed a significant place aversion for the chamber paired with laser stimulation in the GPh (bottom) (F(5,29) = 26.50, P < 0.0001; ***P < 0.001, **P < 0.01, two way ANOVA followed by Tukey’s test). m, Heatmaps for the activity of a representative mouse at baseline (top), or during optogenetic excitation of the GPh in either the left (middle) or right (bottom) chamber. n, Mice did not move faster (left) or further (right) during the Arch stimulation sessions when compared to their baseline activity (T(32) = 0.15, P > 0.05; T(32) = 0.16, P > 0.05; t test, n = 17). o, Mice did not move faster (left) or further (right) during the ChR2 stimulation sessions when compared to their baseline activity (T(8) = 0.12, P > 0.05; T(8) = 0.26, P > 0.05; t test, n = 5). All data are presented as mean ± s.e.m. Diagrams in e, h, k, and i were modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 6. A probabilistic switching task for studying action evaluation
a, Schematic of the task. b, The probability of choosing the left port by one mouse for reward history in which consecutive choices to either the right or the left port were made during the previous two trials. c, The contribution of rewarded and unrewarded outcomes in the previous 5 trials – represented by regression coefficients βReward and βNo Reward, respectively – to choices in the current trial (n = 10 mice, 4685 ± 786 trials per mouse). d, The fraction of left port choice for 10 mice plotted against the relative action value (sum of the regression coefficients from the previous two trials). Data from each mouse was grouped into 10 bins and represented by a distinct colour. e, The actual probability of choosing the left port plotted against the probability of choosing the left port predicted by the logistic regression model. f, Example data from one session showing 12 trial blocks. Blue bars represent left reward blocks (top); orange bars indicate right reward blocks (bottom). Green, orange, and red ticks respectively represent whether a particular trial was a correct rewarded trial, a correct unrewarded trial, or an incorrect trial. The grey dashed line represents a four-trial running average of the mouse’s probability of choosing the left port, and the black line indicates the probability of choosing the left port predicted by the logistic regression model. g & h, Change in chosen value one to three trials after optogenetic inhibition of the GPh (g), or activation of the GPh-LHb pathway (h). i, Changes in chosen value one trial after optogenetic activation or inhibition at the left or right reward port. In g and h, ****P < 0.0001, t test. In b, c and g–i, data are represented as mean ± s.e.m.
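The legend above describes an action value built from regression coefficients on recent rewarded and unrewarded outcomes, which a logistic model turns into a probability of choosing the left port. The legend does not give the exact model, so the following is only a minimal sketch under stated assumptions: the signed coding of choices (left = +1, right = −1), the function names, and every coefficient value are hypothetical and are not taken from the paper.

```python
import math

def action_value(history, beta_reward, beta_no_reward):
    """Sum signed regression coefficients over recent trials.

    history: list of (choice, rewarded) tuples, most recent trial first;
             choice is "L" or "R", rewarded is a bool.
    beta_reward / beta_no_reward: per-lag coefficients (hypothetical values).
    Assumed coding: a left choice contributes +beta, a right choice -beta.
    """
    value = 0.0
    # zip stops at the shortest input, so extra history is ignored.
    for (choice, rewarded), b_r, b_n in zip(history, beta_reward, beta_no_reward):
        side = 1.0 if choice == "L" else -1.0
        value += (b_r if rewarded else b_n) * side
    return value

def p_left(value):
    """Logistic link: probability of choosing the left port."""
    return 1.0 / (1.0 + math.exp(-value))

# Hypothetical coefficients for the previous two trials (lag 1, lag 2).
beta_reward = [1.5, 0.8]
beta_no_reward = [-0.4, -0.2]

# Last trial: left and rewarded; the trial before: left and unrewarded.
history = [("L", True), ("L", False)]
value = action_value(history, beta_reward, beta_no_reward)
print(round(value, 3), round(p_left(value), 3))
```

Under this coding, an action value of zero corresponds to indifference between the two ports, and mirroring every choice from left to right flips the sign of the value.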
Extended Data Figure 7. Optogenetic inhibition or activation of the GPh-LHb pathway does not influence action selection
a, A schematic of optogenetic inhibition of the GPh at the point of action selection. b, Data points indicate the probability of left port choice as a function of action value for the trials in which the photo-stimulation was delivered at the center port (“stim”) or was not delivered (“no stim”). Lines indicate the fit by the logistic regression model on the pooled data for each of the two conditions (n = 5 mice, 15,411 trials, 3082 ± 1063 trials per mouse). c, Similar to b, except that control mice with eYFP-expressing GPh neurons were used (n = 6 mice, 56,241 trials, 9373 ± 596 trials per mouse). d & e, Similar to a and b, except that optogenetic activation of the GPh-LHb projection was applied at the point of action selection (n = 6 mice, 41,557 trials, 8311 ± 2565 trials per mouse). f, Similar to e, except that control mice with eYFP-expressing GPh neurons were used (n = 6 mice, 72,423 trials, 12070 ± 1673 trials per mouse). g & h, The changes in action value in response to optogenetic stimulation of the GPh-LHb pathway one to three trials after the photo-stimulation, for mice in which GPh neurons expressed Arch (n = 5) or eYFP (n = 6) (g), or ChR2 (n = 6) or eYFP (n = 6) (h). In b, c, e, and f, P values reported for t tests: H0: βstim = 0. i–l, Graphs showing the average withdrawal (calculated as the time from center port entry to exit) and movement (calculated as the time from center port exit to the poke at the chosen port) time for trials with or without light stimulation. Both withdrawal time and movement time were shorter when the action value associated with the chosen action was higher. Neither activation of GPh neurons with ChR2 (n = 6 mice) (i & j) (movement time for leftward choices, ChR2 stimulated trials (“ChR2”) vs. unstimulated trials (“no stim”), F(1,8) = 0.174, P > 0.05; rightward choices, ChR2 vs. no stim, F(1,8) = 1.352, P > 0.05; withdrawal time preceding leftward choices, ChR2 vs. 
no stim, F(1,8) = 0.667, P > 0.05; preceding rightward choices, ChR2 vs. no stim, F(1,8) = 0.599, P > 0.05; two way ANOVA), nor inhibition of these neurons with Arch (n = 5 mice) (k & l) (movement time for leftward choices, Arch stimulated trials (“Arch”) vs. unstimulated trials (“no stim”), F(1,8) = 0.105, P > 0.05; rightward choices, Arch vs. no stim, F(1,8) = 0.023, P > 0.05; withdrawal time preceding leftward choices, Arch vs. no stim, F(1,8) = 0.821, P > 0.05; preceding rightward choices, Arch vs. no stim, F(1,8) = 0.459, P > 0.05; two way ANOVA) had any significant effect on the ongoing behaviour. Data in g–l are presented as mean ± s.e.m.
Extended Data Figure 8. Weakening of excitatory or inhibitory synapses onto GPh neurons and its effects on the sensitivity to negative or positive feedback
a, Confocal images from a Som-cre;Ai14 mouse, showing the overlap in expression of GluA4-ct-GFP (delivered by injecting the EP with the AAV-DIO-GluA4-ct-GFP) and tdTomato (indicating the expression of Som) in GPh neurons. 97.86 ± 2.9% of GluA4-ct-GFP+ neurons expressed tdTomato (n = 2 mice). b, Schematics of the experimental approach. CTB-594 was injected into the LHb to label GPh neurons in the EP. On the right is an enlarged graph of the boxed area in the cartoon on the left. Inset is a photomicrograph showing simultaneous recording of a CTB+/GluA4-ct+ GPh neuron and a nearby CTB+/GluA4-ct− GPh neuron. c, EPSC traces recorded from the two neurons shown in b. d, Quantification of the ratio between AMPA receptor-mediated EPSC amplitude and NMDA receptor-mediated EPSC amplitude (AMPA/NMDA ratio) for the two populations of GPh neurons (CTB+/GluA4-ct+, n = 6 cells; CTB+/GluA4-ct−, n = 8 cells; n = 3 mice; T(12) = −1.89, *P < 0.05, t test). e, A representative image showing the expression of GluA4-ct-GFP (delivered by injecting the EP of a Vglut2-Cre mouse with the AAV-DIO-GluA4-ct-GFP) in GPh neurons (left) and a schematic of the approach (right). f, The win-stay percentage in these mice (GPhGluA4-ct, 94.17 ± 1.02%; GPheYFP, 95.82 ± 0.51%; P > 0.05, t test). g, For animals (n = 10 mice) used in Fig. 4a–e, the number of GPh neurons that were infected with the GluA4-ct-GFP virus correlated with the change in animal behaviour in the switching task, measured as an increase in action value following two consecutive unrewarded trials (R² = 0.72, P < 0.05 by a linear regression). h, Contributions of rewarded outcomes over the past five trials, as reflected by their regression coefficients, to the current choice. GPhGluA4-ct mice were not significantly different from control mice or their pre-surgery condition (first two trials back x groups, F(3,33) = 0.5412, P > 0.05; two-way ANOVA, n = 10 GPhGluA4-ct mice and n = 7 control mice). 
i, The action value following two sequentially rewarded trials was not significantly different between GPhGluA4-ct mice and GPheYFP mice (P > 0.05, t test). j, Confocal images from a Som-flp mouse, showing the overlap in expression of Cre-GFP (delivered by injecting the EP with the AAV-FSF-GFP-Cre) and somatostatin, recognized through antibody labelling. 96.25 ± 2.3% of Cre-GFP+ neurons expressed somatostatin (n = 2 mice). k, Schematics of the experimental approach. CTB-594 was injected into the LHb to label GPh neurons in the EP. On the right is an enlarged graph of the boxed area in the cartoon on the left. l, Sample miniature IPSC (mIPSC) traces recorded from a GPh neuron that expressed Cre-GFP – and thus had γ2 ablated (γ2-KO) – and a control GPh neuron that did not express the Cre-GFP (γ2-WT). m, Quantification of the frequency (left) and amplitude (right) of mIPSCs recorded from the two groups of GPh neurons (γ2-KO, n = 7 cells; γ2-WT, n = 10 cells; n = 3 mice; frequency, T(15) = 5.51, ****P < 0.0001; amplitude, T(15) = 8.19, ****P < 0.0001; t test). n, A representative image showing the expression of Cre-GFP (delivered by injecting the EP of a Som-Flp;Gabrg2flox mouse with the AAV-FSF-GFP-Cre) in GPh neurons (left) and a schematic of the approach (right). o, The lose-switch percentage in these mice (P > 0.05, t test). p, For animals (n = 9) used in Fig. 4f–j, the number of GPh neurons that were infected with the Cre-GFP virus correlated with the change in animal behaviour in the switching task, measured as a reduction in action value following two consecutive rewarded trials (R² = 0.53, P < 0.05 by a linear regression). q, The negative regression coefficients associated with the past five trials were not significantly different between GPhγ2-KO mice and control mice either before or after surgery (first two trials back x groups, F(3,35) = 0.9072, P > 0.05, n = 9 GPhγ2-KO mice and n = 9 control mice). 
r, The action value following two sequentially unrewarded trials was not significantly different between GPhγ2-KO mice and GPhmCherry mice (P > 0.05, t test). All data are represented as mean ± s.e.m. Diagrams in b, e, k and n were modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 9. Monosynaptic inputs onto the GPh and a schematic of the circuitry for reinforcement learning
a, Schematics of experimental design. The GPh neurons in the EP were targeted using either Vglut2-Cre;Rosa26-stopflox-tTA mice or by injecting the LHb of Rosa26-stopflox-tTA mice with the retrograde CAV2-Cre. b, Images showing the starter cell location in the EP. c, Relationship between the number of starter and input neurons. d, Graph showing the fraction of monosynaptically labelled neurons in each brain region that projects to the GPh (n = 9 mice). e, Confocal images of the rabies virus and parvalbumin (PV) labelled neurons in the GPe. Only a small fraction of the virally labelled GPe cells expressed PV (arrows). On the right is a high magnification image of the boxed area in the GPe. f, Quantification of the fraction of rabies virus labelled GPe neurons that expressed PV (n = 3 mice). g, Center of mass analysis for all GPe labelled neurons (n = 9 mice). h, A confocal image of the parasubthalamic nucleus (pSTN) showing monosynaptically labelled neurons. i, Center of mass analysis for all pSTN labelled neurons (n = 9 mice). j, A schematic showing the proposed selection and evaluation circuits within the basal ganglia. Question marks indicate elements of the proposed circuit that remain to be tested experimentally. Diagrams in a, g and i were modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Extended Data Figure 10. The proposed function of the basal ganglia and midbrain evaluation circuits
a, Schematic showing the activity of GPh neurons and the downstream circuitry controlling the midbrain dopaminergic system. b, Proposed sequence of events by which GPh activity may influence the firing rate in downstream structures. Upward arrows indicate an increase in firing; downward arrows indicate a decrease in firing. RMTg, rostromedial tegmental nucleus; SNc, substantia nigra pars compacta; VTA, ventral tegmental area; DA, dopamine; DR, dorsal raphe; MR, median raphe. ? indicates that alternative circuits downstream of the LHb, including the serotonergic raphe nuclei, may constitute other key pathways that also process the GPh-LHb prediction error signals that we demonstrate in this study. Diagram in a was modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
Figure 1. GPh neurons bidirectionally integrate reward and punishment related information
a & b, Licking (a) and blinking (b) behaviour from a representative experimental session. The dashed boxed area and dashed line indicate the time of CS and US delivery, respectively. Licking rate (n = 30 sessions from 7 mice, F(2,18) = 41.59, P < 0.0001, P < 0.05 for all comparisons) and blinking occurrence (n = 32 sessions from 4 mice, F(2,9) = 33.13, P < 0.001, P < 0.05 for all comparisons) during the delay between CS and US in recording sessions were compared across different US magnitudes with one-way ANOVA followed by Tukey’s test. c, Responses of an example putative GPh neuron, shown as spike density functions. d & e, auROC (area under the Receiver Operating Characteristic) analysis of differences in firing rate between big and small reward trials (d), or between big and small punishment trials (e), during the peak response to the CS presentation (180–480 ms). Filled bars, P < 0.05, t test. f, Average response of putative GPh neurons to reward. g, Firing rate change during CS predicting reward of different amplitudes (Big vs. Small reward, z = −3.2, **P < 0.01; Small vs. No reward, z = −4.11, ****P < 0.0001; Wilcoxon signed-rank test). h, Average response of putative GPh neurons to punishment. i, Firing rate change during CS predicting punishment of different durations (Big vs. Small punishment, z = 2.27, *P < 0.05; Small vs. No punishment, z = 2.06, *P < 0.05, Wilcoxon signed-rank test). Data are represented as mean ± s.e.m. in a, b, f–i.
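The auROC measure named in panels d and e compares two sets of firing rates. One common way to compute it, shown here only as an illustrative sketch and not as the authors' analysis code, is as the probability that a randomly drawn value from one condition exceeds a randomly drawn value from the other, with ties counted as one half. The function name and the firing rates below are hypothetical.

```python
def auroc(reference, test):
    """Area under the ROC curve for two samples.

    Computed as the probability that a random value from `test`
    exceeds a random value from `reference` (ties count 0.5).
    0.5 means no difference; values above 0.5 mean `test` tends to be
    higher, values below 0.5 mean `test` tends to be lower.
    """
    if not reference or not test:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for t in test:
        for r in reference:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(test) * len(reference))

# Hypothetical firing rates (spikes/s) in two trial types.
small_reward_rates = [10.0, 12.0, 11.0, 9.0]
big_reward_rates = [6.0, 7.0, 9.0, 5.0]

# A value well below 0.5: firing is lower in the second condition.
print(auroc(small_reward_rates, big_reward_rates))
```

The pairwise formulation is quadratic in the number of trials, which is acceptable for a per-neuron comparison of a few dozen trials; a rank-based computation would scale better for large samples.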
Figure 2. GPh responses to unconditioned stimuli are modulated by expectation
a, CS and US (airpuff) responses of an example putative GPh neuron tracked over multiple sessions. Session-by-session waveform correlations for this individual unit were >0.96. b, CS–US (airpuff) response index for 36 putative GPh neurons (6 mice) across different stages of training (red dots represent values of the sample neuron in a) (r² = 0.56, P < 0.0001 by a linear regression). c, Responses of an example GPh neuron to an expected airpuff (red), an unexpectedly omitted airpuff (orange), or an unsignalled reward (grey). d, auROC analysis of differences in firing rate between baseline and US presentation time (1.7–1.9 s) in airpuff omission trials (n = 21 neurons from 6 mice). Filled bars, P < 0.05, t test. e, CS and US (reward) responses of an example putative GPh neuron tracked over multiple sessions. Session-by-session waveform correlations for this individual unit were >0.97. f, CS–US (reward) response index for 60 putative GPh neurons (9 mice) across different stages of training (the blue dots represent values of the sample neuron in e) (r² = 0.48, P < 0.0001 by a linear regression). g, Responses of an example GPh neuron to an expected reward (blue), an unexpectedly omitted reward (light blue), or an unsignalled airpuff (grey). h, auROC analysis of differences in firing rate between baseline and US presentation time (1.7–1.9 s) in reward omission trials (n = 15 neurons from 4 mice). Filled bars, P < 0.05, t test.
Figure 3. Optogenetic inhibition or activation of the GPh-LHb pathway bidirectionally influences reinforcement
a, Schematic of the optogenetic inhibition. b, Probability of left port choice as a function of action value, for trials immediately following the trials in which photo-inhibition was delivered when mice entered the left (“stim left”) or right (“stim right”) port, or was not delivered (“no stim”). Coloured lines indicate the fit by the logistic regression model on the pooled data for each of the three conditions; grey lines indicate the pooled data for each individual mouse (n = 5 mice, 34,627 trials, 6,943 ± 1330 trials per mouse). c, Similar to b, except that control mice with eYFP-expressing GPh neurons were used (n = 6 mice, 79,589 trials, 13,265 ± 596 trials per mouse). d & e, Similar to a and b, except that optogenetic activation of the GPh-LHb projection was applied (n = 6 mice, 42,292 trials, 4,424 ± 1806 trials per mouse). f, Similar to e, except that control mice with eYFP-expressing GPh neurons were used (n = 6 mice, 45,389 trials, 7,564 ± 2120 trials per mouse). In b, c, e, and f, P values are reported for t tests: H0: βstim = 0.
Figure 4. Reducing glutamatergic or GABAergic drive onto GPh neurons decreases sensitivity to negative or positive feedback, respectively
a, Bar graphs showing the increased perseverance of GPhGluA4-ct mice (n = 10) compared to GPheYFP controls (in which eYFP was introduced into GPh neurons by a Cre-dependent virus; n = 7). For clarity, only choices for which mice had previously made two consecutive responses at the same port are shown. b, The lose-switch percentage in these mice (GPhGluA4-ct, 31.13 ± 1.7%; GPheYFP, 45.67 ± 5.1%; T(13) = 2.58, *P < 0.05, t test). c, The number of trials mice took before reversing choice after reward contingencies were switched (GPhGluA4-ct, 4.59 ± 0.18 trials; GPheYFP, 2.56 ± 0.15 trials; T(13) = 6.02, ****P < 0.0001, t test). d, The negative regression coefficients associated with the past five trials for GPhGluA4-ct mice and GPheYFP mice before and after surgery (first two trials back x groups, F(3,33) = 6.566, P < 0.01; *P < 0.05 for GPhGluA4-ct compared to all other groups on the second trial back; two-way ANOVA followed by Bonferroni’s test). e, The action value of two sequentially unrewarded trials, derived from the sum of their regression coefficients (T(16) = 3.46, **P < 0.01, t test). f, Bar graphs showing decreased perseverance in GPhγ2-KO mice (n = 8) compared to GPhmCherry controls (in which mCherry was introduced into GPh neurons by a Flp-dependent virus; n = 8). Only choices where mice previously made two consecutive responses at the same port are shown. g, The win-stay percentage in these mice (GPhγ2-KO, 89.0 ± 0.7%; GPhmCherry, 95.6 ± 0.5%; T(16) = −6.61, ****P < 0.0001, t test). h, The number of trials mice took before reversing choice after reward contingencies were switched (GPhγ2-KO, 2.08 ± 0.26 trials; GPhmCherry, 2.45 ± 0.27 trials; T(16) = −2.74, *P < 0.05, t test). 
i, The positive regression coefficients associated with the past five trials for GPhγ2-KO mice and GPhmCherry mice before and after surgery (first two trials back x groups, F(3,31) = 42.10, P < 0.0001; ****P < 0.0001 for GPhγ2-KO compared to all other groups on the first trial back; two-way ANOVA followed by Bonferroni’s test). j, The action value of two sequentially rewarded trials, derived from the sum of their regression coefficients (T(16) = −7.49, ****P < 0.0001, t test). All data are represented as mean ± s.e.m.
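The win-stay and lose-switch percentages reported in panels b and g are not defined in the legend. The sketch below assumes the conventional definitions: win-stay is the percentage of rewarded trials followed by a repeat of the same choice, and lose-switch is the percentage of unrewarded trials followed by a change of choice. The function name and the example session are hypothetical.

```python
def win_stay_lose_switch(choices, rewarded):
    """Return (win_stay_pct, lose_switch_pct) for one session.

    choices: sequence of port labels, one per trial (e.g. "L" or "R").
    rewarded: sequence of bools, True if that trial was rewarded.
    A percentage is None when its denominator is zero.
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")
    wins = stays = losses = switches = 0
    # Each trial except the last is scored by what the animal does next.
    for prev in range(len(choices) - 1):
        repeated = choices[prev + 1] == choices[prev]
        if rewarded[prev]:
            wins += 1
            stays += int(repeated)
        else:
            losses += 1
            switches += int(not repeated)
    win_stay = 100.0 * stays / wins if wins else None
    lose_switch = 100.0 * switches / losses if losses else None
    return win_stay, lose_switch

# Hypothetical six-trial session.
choices = ["L", "L", "L", "R", "R", "L"]
rewarded = [True, False, False, True, True, False]
print(win_stay_lose_switch(choices, rewarded))
```

In this example two of the three rewarded trials are followed by the same choice and one of the two scored unrewarded trials is followed by a switch; the final trial is not scored because no choice follows it.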
Figure 5. Identification of monosynaptic inputs to the GPh
a, Series of coronal sections, ipsilateral to site of injection, from a representative mouse (i.e., CAV5) showing the major monosynaptic inputs to the GPh. b, Confocal image of the dorsal striatum (DS) with monosynaptically labelled neurons (green) and Mu Opioid Receptor (MOR) immunostaining that labels the striosomes (red). c, Quantification of monosynaptically labelled cells in striatal subcompartments. d, The injection strategy. e, Schematic of recording configuration (upper) and sample inhibitory postsynaptic currents (IPSCs) induced by optogenetic activation of the striatal input to the GPh (lower left). These IPSCs were blocked by picrotoxin (PTX) (lower right). f, Image of the pSTN with monosynaptically labelled neurons (red). g, The injection strategy. h, Schematic of recording configuration (upper) and sample excitatory postsynaptic currents (EPSCs) induced by optogenetic activation of pSTN input to the GPh (lower left). These EPSCs were blocked by CNQX (lower right). Diagrams in d and g are modified from the Allen Mouse Brain Atlas, Allen Institute for Brain Science; available from http://mouse.brain-map.org/.
