Combined model-free and model-sensitive reinforcement learning in non-human primates
- PMID: 32569311
- PMCID: PMC7332075
- DOI: 10.1371/journal.pcbi.1007944
Abstract
Contemporary reinforcement learning (RL) theory suggests that potential choices can be evaluated by strategies that may or may not be sensitive to the computational structure of tasks. A paradigmatic model-free (MF) strategy simply repeats actions that have been rewarded in the past; by contrast, model-sensitive (MS) strategies exploit richer information associated with knowledge of task dynamics. MF and MS strategies should typically be combined, because they have complementary statistical and computational strengths; however, this tradeoff between MF and MS RL has been demonstrated mostly in humans, often with only modest numbers of trials. We trained rhesus monkeys to perform a two-stage decision task designed to elicit and discriminate the use of MF and MS methods. A descriptive analysis of choice behaviour revealed directly that the structure of the task (of MS importance) and the reward history (of MF and MS importance) significantly influenced both choice and response vigour. A detailed, trial-by-trial computational analysis confirmed that choices were made according to a combination of strategies, with a dominant influence of a particular form of model sensitivity that persisted over weeks of testing. The residuals from this model necessitated the development of a new combined RL model that incorporates a particular credit assignment weighting procedure. Finally, response vigour exhibited a subtly different collection of MF and MS influences. These results shed new light on RL behavioural processes in non-human primates.
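To make the idea of combining MF and MS evaluation concrete, the following is a minimal sketch of a generic weighted mixture for a two-stage task. It is not the combined model developed in the paper, and it omits the credit assignment weighting procedure mentioned above. The transition probabilities, value tables, and the parameters `w`, `beta`, and `alpha` are all hypothetical, chosen only for illustration.

```python
import math


def model_sensitive_values(transition, q_stage2):
    """MS value of each first-stage action: the expected value of the best
    second-stage option, weighted by the (assumed known) transition model."""
    return [
        sum(p * max(q_stage2[state]) for state, p in enumerate(row))
        for row in transition
    ]


def hybrid_values(q_mf, q_ms, w):
    """Weighted mixture: w = 1 is purely MS, w = 0 is purely MF."""
    return [w * ms + (1.0 - w) * mf for mf, ms in zip(q_mf, q_ms)]


def softmax(values, beta):
    """Choice probabilities; beta is a hypothetical inverse temperature."""
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mf_update(q, action, reward, alpha):
    """MF update: move the chosen action's value towards the reward."""
    updated = list(q)
    updated[action] += alpha * (reward - updated[action])
    return updated


# Hypothetical task: 2 first-stage actions, 2 second-stage states.
transition = [[0.7, 0.3], [0.3, 0.7]]   # P(state | first-stage action)
q_stage2 = [[0.8, 0.2], [0.1, 0.4]]     # learned second-stage values
q_mf = [0.5, 0.5]                       # cached first-stage MF values

q_ms = model_sensitive_values(transition, q_stage2)
q_net = hybrid_values(q_mf, q_ms, w=0.6)
probs = softmax(q_net, beta=3.0)
q_mf_next = mf_update(q_mf, action=0, reward=1.0, alpha=0.1)
```

In this sketch the MF values ignore the transition structure entirely, whereas the MS values change as soon as either the transition model or the second-stage values change, which is the kind of dissociation a two-stage task is designed to expose.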
Conflict of interest statement
The authors have declared that no competing interests exist.
