eLife. 2018 Jun 13;7:e35988.
doi: 10.7554/eLife.35988.

Orbital frontal cortex updates state-induced value change for decision-making

Emily T Baltz et al. eLife. 2018.

Abstract

Recent hypotheses have posited that orbital frontal cortex (OFC) is important for using inferred consequences to guide behavior. Less clear is OFC's contribution to goal-directed or model-based behavior, where the decision to act is controlled by previous experience with the consequence or outcome. Investigating OFC's role in learning about changed outcomes separate from decision-making is not trivial and often the two are confounded. Here we adapted an incentive learning task to mice, where we investigated processes controlling experience-based outcome updating independent from inferred action control. We found chemogenetic OFC attenuation did not alter the ability to perceive motivational state-induced changes in outcome value but did prevent the experience-based updating of this change. Optogenetic inhibition of OFC excitatory neuron activity selectively when experiencing an outcome change disrupted the ability to update, leaving mice unable to infer the appropriate behavior. Our findings support a role for OFC in learning that controls decision-making.

Keywords: action control; mouse; neuroscience; orbital frontal cortex; value.

Conflict of interest statement

EB, EY, RR, CG: No competing interests declared.

Figures

Figure 1. Positive incentive learning in mice.
(A) Schematic showing training, re-exposure, and testing schedule for positive incentive learning. Group n’s: 2→ 2: n = 5; 2→ 16: n = 11. Data points and bar graphs represent the mean ± SEM. (B) Number of licks and (C) licking rate (10 min bins) during the re-exposure session. (D) The average latency to begin licking after a sucrose delivery (s). (E) The ratio of licks that occur in bursts, (F) average duration of bursts (ms), (G) average number of licks within a burst, and (H) average interlick interval within bursts (ms). (I) Average burst duration after a sucrose delivery and (J) average number of licks within a burst after a sucrose delivery. (K) Response rate during the 5 min non-rewarded test as a percent of acquisition response rate (last 2 days of training). (L) Schematic of training, re-exposure, and testing schedule for context positive incentive learning. Group n’s: context + sucrose -: n = 5; context + sucrose +: n = 11; context - sucrose -: n = 11; context - sucrose +: n = 11. (M) Percent of baseline responding (last two training days) for mice not exposed to sucrose, exposed to neither sucrose nor the context, exposed to sucrose in the context, and exposed to sucrose in the home cage. * indicates p=0.05, # indicates p=0.06.
Figure 1—figure supplement 1. Acquisition of lever pressing for positive and context incentive learning.
(A) Responses per minute during training days on the left lever for positive incentive learning group. Two-way repeated measures ANOVA: no interaction; no effect of group; main effect of training day (F(7,70) = 3.375, p=0.0037). (B) Left lever presses for each training day for incentive learning. Two-way repeated measures ANOVA: no interaction; no effect of group; main effect of training day: F(7,70) = 3.37, p=0.0037. (C) Sucrose outcomes earned each training day during positive incentive learning. Two-way repeated measures ANOVA: no interaction; no effect of group; main effect of training day: F(10,100) = 2.58, p=0.008. (D) For the context experiment, mice were trained at a 2 hr food restriction and tested at a 16 hr restriction. Mice were either re-exposed to sucrose in the operant context, in their home cage, or were not re-exposed to sucrose and had the same amount of time in the operant context or home cage. Responses per minute of mice trained at a 2 hr restriction. Two-way repeated measures ANOVA: no interaction or main effects (Fs < 1.52, ps > 0.19). (E) Left lever presses over each training day. Two-way repeated measures ANOVA: no interaction: F = 1.02, p=0.44; no main effect of group: F = 0.90, p=0.46; main effect of training day: F(5,130) = 2.82, p=0.0187. (F) Sucrose outcomes earned over training days. Two-way repeated measures ANOVA: no group x training day interaction, F = 0.90, p=0.61; no main effect of group, F = 2.38, p=0.09; main effect of training day: F(8,232) = 9.72, p<0.0001. (G) Left lever presses on non-rewarded test day for positive incentive learning. Unpaired t-test: F < 1.01, p>0.99.
Figure 2. Negative incentive learning in mice.
(A) Schematic showing training, re-exposure, and testing schedule for negative incentive learning. Group n’s: 16→ 16: n = 8; 16→ 2: n = 7. Data points and bar graphs represent the mean ± SEM. (B) Number of licks and (C) licking rate (10 min bins) during the re-exposure session. (D) The average latency to begin licking after a sucrose delivery (s). (E) The ratio of licks that occur in bursts, (F) average duration of bursts (ms), (G) average number of licks within a burst, and (H) average interlick interval within bursts (ms). (I) Average burst duration after a sucrose delivery and (J) average number of licks within a burst after a sucrose delivery. (K) Response rate during the 5 min non-rewarded test as a percent of acquisition response rate (last 2 days of training). * indicates p<0.05.
Figure 2—figure supplement 1. Acquisition of lever pressing for negative incentive learning.
(A) Responses per minute during training days on the left lever for negative incentive learning group. Two-way ANOVA: interaction (F (7,91)=3.97, p=0.0008); no effect of group; main effect of training day (F (7, 91)=9.64, p<0.0001). (B) Left lever presses for each training day for negative incentive learning. Negative incentive learning: interaction: F (7,91)=3.29, p=0.0036; no effect of group; main effect of training day: F (7,91)=16.14, p<0.0001. (C) Sucrose outcomes earned each training day for negative incentive learning. Negative incentive learning: no interaction; no effect of group; main effect of training day: F (10,130)=7.77, p<0.0001. (D) Left lever presses on non-rewarded test day for negative incentive learning. Unpaired t-test: F = 2.58, p=0.1489.
Figure 3. Orbitofrontal cortex attenuation prevents positive incentive learning.
(A) Schematic of injection site (left) and representative mCherry expression in OFC (right). (B) Representative traces and (C) summary data from ex vivo physiological whole-cell recordings in hM4Di-expressing OFC projection neurons during baseline and following CNO bath application to H4 slices (n = 8 cells). (D) Training and testing schematic showing when OFC manipulations were given, with CNO given only during the re-exposure session. Group n’s: 2→ 2 Ctl: n = 9; 2→ 16 Ctl: n = 14; 2→ 2 H4: n = 7; 2→ 16 H4: n = 17. (E) Number of licks and (F) licking rate (10 min bins) during the re-exposure session. (G) The average latency to begin licking after a sucrose delivery (s). (H) The ratio of licks that occur in bursts, (I) average duration of bursts (s), (J) average number of licks within a burst, and (K) average interlick interval within bursts (ms). (L) Average burst duration after a sucrose delivery (s) and (M) average number of licks within a burst after a sucrose delivery. (N) Response rate during the 5 min non-rewarded test as a percent of acquisition response rate (last 2 days of training). (O) Percent of baseline responding from non-rewarded to the rewarded test. Data points represent individual subjects, and bar graphs and error bars represent the mean ± SEM. * indicates p<0.05.
Figure 3—figure supplement 1. Orbitofrontal cortex excitation does not generate an increased motivational state.
(A) Spikes and trace from H3 slice data (n = 7 cells). Two-way repeated measures ANOVA: interaction of Current x Treatment: F(10,60) = 2.94; main effect of Treatment: F(1,6) = 23.09, p=0.003; main effect of Current: F(10,60) = 29.63, p<0.0001. (B) Total licks during RT session. Group n’s: 2→ 2 Ctl: n = 5; 2→ 2 H3: n = 6. (C) Licking rate in 10 min bins over the 60 min re-exposure session. (D) Licks occurring in bursts over total licks. (E) Average burst duration. (F) Average number of licks within a burst. (G) Average latency to begin licking after sucrose deliveries. (H) Average interlick interval within bursts. (I) Average burst duration after sucrose deliveries. (J) Average number of licks within a burst after an outcome delivery. (K) Percent of baseline responding during non-rewarded test. Data points and bar graphs represent the mean ± SEM. * indicates p<0.05. All t-tests for licking data and non-rewarded test day: p>0.05.
Figure 3—figure supplement 2. Left lever presses in OFC positive incentive learning during test day.
Two-way ANOVA (Food Restriction x OFC Treatment): no interactions, no main effects (Fs < 0.33, ps > 0.56).
Figure 4. Orbital frontal cortex attenuation prevents negative incentive learning.
(A) Training, re-exposure, and testing schematic showing that OFC attenuation occurred only during the re-exposure session. Group n’s: 16→ 16 Ctl: n = 18; 16→ 2 Ctl: n = 16; 16→ 16 H4: n = 16; 16→ 2 H4: n = 17. (B) Number of licks and (C) licking rate (10 min bins) during the re-exposure session. (D) The average latency to begin licking after a sucrose delivery (s). (E) The ratio of licks that occur in bursts, (F) average duration of bursts (ms), (G) average number of licks within a burst, and (H) average interlick interval within bursts (ms). (I) Average burst duration after a sucrose delivery (s) and (J) average number of licks within a burst after a sucrose delivery. (K) Response rate during the 5 min non-rewarded test as a percent of acquisition average response rate (last 2 days of training). (L) Percent of baseline responding from non-rewarded to the rewarded test. * indicates p<0.05.
Figure 4—figure supplement 1. Orbitofrontal cortex inhibition does not change sucrose preference.
(A) Consumption of 20% sucrose solution and water by control and OFC-attenuated mice during a two-bottle choice test. Two-way ANOVA: main effect of liquid: F(1,36) = 24.47, p<0.0001; no effect of OFC treatment; no interaction. (B) Consumption of 20% and 4% sucrose solutions by control and OFC-attenuated mice during a two-bottle choice test. Two-way ANOVA: main effect of sucrose concentration: F(1,36) = 11.11, p=0.002; main effect of OFC treatment: F(1,36) = 8.42, p=0.006; no interaction.
Figure 4—figure supplement 2. Left lever presses during OFC negative incentive learning test day.
Two-way ANOVA (Food Restriction x OFC Treatment): no interaction, no main effects (Fs < 2.1, ps > 0.15).
Figure 5. Optogenetic inhibition of OFC projection neurons during sucrose consumption prevents value updating.
(A) Schematic of injection site and ferrule implant (top), with DIO ChR2-eYFP detected at OFC injection site (bottom). (B) Schematic of the setup for inhibiting OFC excitatory neurons (OFC +) by activating Parvalbumin (PV) interneurons. (C) Confirmation of ChR2 function using ex vivo whole-cell recording. (D) Closed-loop behavioral control over light delivery. Example in which, for the ChR2 group, the first lick after a sucrose delivery triggered a 5 s light delivery (5 ms pulse, 20 Hz) (left), while the Yoked group received light delivery at the same time independent of licking behavior (right). Group n’s: ChR2: n = 8; Yoked: n = 8. (E) Number of licks and (F) licking rate (10 min bins) during the re-exposure session. (G) The average latency to begin licking after a sucrose delivery (s). (H) The ratio of licks that occur in bursts, (I) average duration of bursts (ms), (J) average number of licks within a burst, and (K) average interlick interval within bursts (ms). (L) Average burst duration after a sucrose delivery (s) and (M) average number of licks within a burst after a sucrose delivery. (N) Response rate during the 5 min non-rewarded test as a percent of acquisition response rate (last 2 days of training). Data points and bar graphs represent the mean ± SEM. * indicates p<0.05.
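
Two quantities in the captions above are simple arithmetic and may be easier to follow as code: the test response rate expressed as a percent of the acquisition (baseline) response rate, and the light pulse train implied by "5 s light delivery (5 ms pulse, 20 Hz)". The Python sketch below is illustrative only and is not the authors' analysis code. The function names, argument names, and example numbers are hypothetical, and it assumes the baseline is the plain mean of the per-day response rates from the last two training days.

```python
def percent_of_baseline(test_presses, test_minutes, baseline_rates):
    """Express a test-session response rate as a percent of baseline.

    baseline_rates: responses per minute on each baseline day
    (assumed here to be the last two training days, averaged equally).
    """
    baseline = sum(baseline_rates) / len(baseline_rates)
    if baseline <= 0:
        raise ValueError("baseline response rate must be positive")
    test_rate = test_presses / test_minutes  # responses per minute
    return 100.0 * test_rate / baseline


def pulse_train(duration_s=5.0, freq_hz=20.0, pulse_ms=5.0):
    """Summarize a fixed-frequency light pulse train.

    Returns (number of pulses, duty cycle, total light-on time in s).
    Defaults follow the caption: 5 s train, 20 Hz, 5 ms pulses.
    """
    n_pulses = round(duration_s * freq_hz)
    duty_cycle = (pulse_ms / 1000.0) * freq_hz
    light_on_s = n_pulses * pulse_ms / 1000.0
    return n_pulses, duty_cycle, light_on_s


if __name__ == "__main__":
    # Hypothetical mouse: 50 presses in the 5 min test,
    # 8 and 12 responses/min on the last two training days.
    print(percent_of_baseline(50, 5, [8.0, 12.0]))
    print(pulse_train())
```

Under these assumptions, a value above 100 means the mouse responded faster in the non-rewarded test than during acquisition, and each 5 s light delivery corresponds to 100 pulses with the light on for about 10% of the train.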
