Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making
- PMID: 18032658
- PMCID: PMC6673291
- DOI: 10.1523/JNEUROSCI.2496-07.2007
Abstract
The computational framework of reinforcement learning has been used to advance our understanding of the neural mechanisms underlying reward learning and decision-making behavior. It is known that humans vary widely in their performance in decision-making tasks. Here, we used a simple four-armed bandit task in which subjects are almost evenly split into two groups on the basis of their performance: those who learn to favor the optimal action and those who do not. Using models of reinforcement learning, we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role of prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision-making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.
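To make the idea of a prediction error in a four-armed bandit concrete, here is a minimal sketch. The abstract does not state which reinforcement learning model was fitted, so this example assumes a simple delta-rule value learner with softmax action selection. The reward probabilities, the learning rate `alpha`, the inverse temperature `beta`, and the function names are all hypothetical choices made for illustration, not values or code from the study.

```python
import math
import random


def softmax(values, beta):
    """Convert action values to choice probabilities (beta = inverse temperature)."""
    peak = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - peak)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_bandit(reward_probs, n_trials=200, alpha=0.2, beta=5.0, seed=0):
    """Run a delta-rule learner on a Bernoulli bandit (illustrative assumption).

    Returns the final action values, the per-trial prediction errors,
    and the fraction of trials on which the best arm was chosen.
    """
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)
    best_arm = max(range(len(reward_probs)), key=lambda a: reward_probs[a])
    prediction_errors = []
    best_choices = 0
    for _ in range(n_trials):
        probs = softmax(values, beta)
        choice = rng.choices(range(len(values)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        delta = reward - values[choice]  # prediction error: outcome minus expectation
        values[choice] += alpha * delta  # move the chosen value toward the outcome
        prediction_errors.append(delta)
        best_choices += choice == best_arm
    return values, prediction_errors, best_choices / n_trials


if __name__ == "__main__":
    # Hypothetical reward probabilities for the four arms.
    values, errors, frac_best = simulate_bandit([0.2, 0.4, 0.6, 0.8])
    print("final values:", [round(v, 2) for v in values])
    print("fraction of optimal choices:", round(frac_best, 2))
```

In a sketch like this, the per-trial `delta` series is the kind of quantity that model-based imaging analyses typically use as a regressor, and the fraction of optimal choices is one possible behavioral performance measure. Setting `alpha` to 0 gives an agent whose values never change, a crude stand-in for a participant who does not learn from outcomes.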