Modulation of value-based decision making behavior by subregions of the rat prefrontal cortex

Jeroen P H Verharen et al. Psychopharmacology (Berl). 2020 May;237(5):1267-1280. doi: 10.1007/s00213-020-05454-7. Epub 2020 Feb 6.

Abstract

Rationale: During value-based decision-making, organisms make choices on the basis of reward expectations, which have been formed during prior action-outcome learning. Although it is known that neuronal manipulations of different subregions of the rat prefrontal cortex (PFC) have qualitatively different effects on behavioral tasks involving value-based decision-making, it is unclear how these regions contribute to the underlying component processes.

Objectives: Assessing how different regions of the rodent PFC contribute to component processes of value-based decision-making behavior, including reward (or positive feedback) learning, punishment (or negative feedback) learning, response persistence, and exploration versus exploitation.

Methods: We performed behavioral modeling of data from rats performing a probabilistic reversal learning task after pharmacological inactivation of five PFC subregions, to assess how inactivation of these regions affected the structure of the animals' responding in the task.

Results: Our results show reductions in reward and punishment learning after PFC subregion inactivation. The prelimbic, infralimbic, lateral orbital, and medial orbital PFC particularly contributed to punishment learning, and the prelimbic and lateral orbital PFC to reward learning. In addition, response persistence depended on the infralimbic and medial orbital PFC. As a result, pharmacological inactivation of the infralimbic and lateral orbitofrontal cortex reduced the number of reversals achieved, whereas inactivation of the prelimbic and medial orbitofrontal cortex decreased the number of rewards obtained. Finally, using simulated data, we explain discrepancies with a previous study and demonstrate complex, interacting relationships between conventional measures of probabilistic reversal learning performance, such as win-stay/lose-switch behavior, and component processes of value-based decision-making.

Conclusions: Together, our data suggest that distinct components of value-based learning and decision-making are generated in medial and orbital PFC regions, displaying functional specialization and overlap, with a prominent role of large parts of the PFC in negative feedback processing.

Keywords: Behavioral modeling; Decision-making; Prefrontal cortex; Punishment; Rats; Reinforcement learning; Reward; Value.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Effects of PFC inactivation on probabilistic reversal learning. a Probabilistic reversal learning setup. b Example session of one rat. c Effects of PFC inactivation on probabilistic reversal learning. ACC, n = 10 rats; PrL, n = 12 rats; IL, n = 9; mOFC, n = 9; lOFC, n = 9. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 (post hoc Holm-Sidak test; see also the Supplementary statistics table in Online Resource 1; for infusion sites see Online Resource 3)
Fig. 2
Behavioral model selection. a We fit several reinforcement learning models to our data and estimated which model (i.e., strategy) best described the animals’ behavior. Numbers in parentheses refer to the number of free parameters in the model (see also Online Resources 2 (model equations), 4 (table of model selection), and 5 (model selection per inactivation condition)). b The “winning” model was a Rescorla-Wagner model (RW3), in which the animals track the value of both nose pokes over an extended history of outcomes by learning from reward and punishment (i.e., reward versus reward omission)
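
The exact equations of the winning RW3 model are given in Online Resource 2 and are not reproduced here, but a minimal sketch of this kind of model follows: a Rescorla-Wagner value update with separate learning rates for reward and punishment (reward omission), a softmax choice rule with an explore/exploit parameter β, and a stickiness bonus for repeating the previous choice. Parameter names and the exact form of the stickiness term are illustrative assumptions, not the paper's parameterization.

```python
import numpy as np

def softmax_choice(q, beta, last_choice, stickiness, rng):
    """Pick one of the two nose pokes from current values (softmax rule)."""
    v = q.copy()
    if last_choice is not None:
        v[last_choice] += stickiness          # perseveration bonus (assumed form)
    p = np.exp(beta * v) / np.sum(np.exp(beta * v))
    return rng.choice(2, p=p)

def update_values(q, choice, rewarded, alpha_reward, alpha_punish):
    """Rescorla-Wagner update of the chosen option from the trial outcome."""
    if rewarded:
        q[choice] += alpha_reward * (1.0 - q[choice])   # positive feedback learning
    else:
        q[choice] += alpha_punish * (0.0 - q[choice])   # negative feedback learning
    return q
```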
Fig. 3
Model coefficients. Best-fit model parameters for each session. Inactivation of the PrL and lOFC impaired reward and punishment learning, whereas inactivation of the IL and mOFC impaired punishment learning and reduced choice perseveration (i.e., repeated choices for the same nose poke hole). ACC, n = 10 rats; PrL, n = 12 rats; IL, n = 9; mOFC, n = 9; lOFC, n = 9. *P < 0.05, **P < 0.01, ***P < 0.001 (post hoc Holm-Sidak test; see also Online Resource 1)
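
For context, a sketch of how such per-session parameters could be obtained by maximum-likelihood fitting is shown below, using the model form sketched above. The optimizer, bounds, starting values, and likelihood form are assumptions for illustration; the paper's actual fitting procedure is described in its Methods and Online Resource 2.

```python
import numpy as np
from scipy.optimize import minimize

def negative_log_likelihood(params, choices, outcomes):
    """NLL of one session's choices (0/1) and outcomes (rewarded or not)."""
    alpha_reward, alpha_punish, stickiness, beta = params
    q, last_choice, nll = np.zeros(2), None, 0.0
    for choice, rewarded in zip(choices, outcomes):
        v = q.copy()
        if last_choice is not None:
            v[last_choice] += stickiness
        p = np.exp(beta * v) / np.sum(np.exp(beta * v))
        nll -= np.log(p[choice] + 1e-12)
        alpha = alpha_reward if rewarded else alpha_punish
        q[choice] += alpha * (float(rewarded) - q[choice])
        last_choice = choice
    return nll

def fit_session(choices, outcomes):
    """Fit learning rates, stickiness, and beta for a single session."""
    x0 = np.array([0.3, 0.3, 0.2, 2.0])          # illustrative starting values
    bounds = [(0, 1), (0, 1), (-5, 5), (0, 20)]  # illustrative bounds
    res = minimize(negative_log_likelihood, x0, args=(choices, outcomes),
                   bounds=bounds, method="L-BFGS-B")
    return res.x
```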
Fig. 4
Visual summary. PFC subregions have distinct, albeit overlapping, functions in value-based behaviors. All regions except the ACC are involved in punishment learning. Shown is the Z-score of the B/M effect, (mean_BM − mean_Sal) / SD_Sal
Fig. 5
Simulated data. We simulated probabilistic reversal learning sessions (50 simulations per condition (i.e., per pixel), 200 trials per session) to assess how changes in the computational model parameters affect conventional measures of task performance in the (simulated) data. The explore/exploit parameter β was fixed at the animals’ grand average from the experimental data (β = 1.686). The highest number of reversals was obtained by a combination of high learning and high stickiness, whereas the number of rewarded trials was maximized by high learning and an intermediate value of the stickiness parameter. Win-stay/lose-switch measures depended mostly on the stickiness parameter but could be modulated by both reward and punishment learning. Thus, whether changes in the computational model parameters lead to significant changes in conventional task measures depends strongly on the animal’s baseline behavior and on the size and direction of the effects.
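
A minimal sketch of one such simulated session, reusing the choice and update rules sketched earlier, is shown below. The 80/20 reward probabilities and the eight-consecutive-correct reversal criterion are assumptions for illustration; the task parameters actually used are described in the paper's Methods.

```python
import numpy as np

def simulate_session(alpha_reward, alpha_punish, stickiness,
                     beta=1.686, n_trials=200, seed=0):
    """Simulate one probabilistic reversal learning session; return (reversals, rewards)."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)                      # values of the two nose pokes
    good, last_choice = 0, None          # index of the high-probability poke
    streak, reversals, rewards = 0, 0, 0
    for _ in range(n_trials):
        choice = softmax_choice(q, beta, last_choice, stickiness, rng)
        p_reward = 0.8 if choice == good else 0.2    # assumed 80/20 schedule
        rewarded = rng.random() < p_reward
        q = update_values(q, choice, rewarded, alpha_reward, alpha_punish)
        rewards += int(rewarded)
        streak = streak + 1 if choice == good else 0
        if streak >= 8:                  # assumed reversal criterion
            good, streak, reversals = 1 - good, 0, reversals + 1
        last_choice = choice
    return reversals, rewards
```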
