Comparative Study
J Neurosci. 2007 Nov 21;27(47):12860-7.
doi: 10.1523/JNEUROSCI.2496-07.2007.

Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making

Tom Schönberg et al. J Neurosci.

Abstract

The computational framework of reinforcement learning has been used to advance our understanding of the neural mechanisms underlying reward learning and decision-making behavior. It is known that humans vary widely in their performance in decision-making tasks. Here, we used a simple four-armed bandit task in which subjects are almost evenly split into two groups on the basis of their performance: those who do learn to favor choice of the optimal action and those who do not. Using models of reinforcement learning, we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role of prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision-making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.
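The prediction error signal discussed in the abstract can be illustrated with a minimal sketch of a delta-rule learner on a four-armed bandit. This is not the authors' model: the abstract does not specify it. The deck win probabilities are taken from the Figure 1 caption; the softmax choice rule, the learning rate `alpha`, the inverse temperature `beta`, the 1/0 reward coding, the trial count, and all function names are assumptions made for illustration.

```python
import math
import random

# Win probabilities of the four decks, as given in the Figure 1 caption.
WIN_PROBS = [0.75, 0.60, 0.40, 0.25]


def softmax(values, beta):
    """Convert action values into choice probabilities (assumed choice rule)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(n_trials=300, alpha=0.2, beta=5.0, seed=0):
    """Simulate a delta-rule learner; all parameter values are hypothetical."""
    rng = random.Random(seed)
    q = [0.0] * len(WIN_PROBS)  # estimated value of each deck
    choices, pes = [], []
    for _ in range(n_trials):
        probs = softmax(q, beta)
        choice = rng.choices(range(len(q)), weights=probs)[0]
        # Assumed reward coding: 1 for a win, 0 otherwise.
        reward = 1.0 if rng.random() < WIN_PROBS[choice] else 0.0
        pe = reward - q[choice]   # prediction error: obtained minus expected
        q[choice] += alpha * pe   # move the estimate toward the outcome
        choices.append(choice)
        pes.append(pe)
    return choices, pes, q


if __name__ == "__main__":
    choices, pes, q = simulate()
    print("final value estimates:", [round(v, 2) for v in q])
```

In model-based fMRI analyses of this kind, a trial-by-trial series such as `pes` is typically what gets regressed against the imaging signal; the sketch stops at generating that series.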

Figures

Figure 1.
A, General outline of a trial in the card-betting task. The task contained four decks of cards. Each deck had a predefined probability of winning of either 75, 60, 40, or 25%. On each trial, subjects had to choose one of the four decks. Participants were unaware of the probability assigned to each deck. B, Subjects' performance during fMRI scanning. Separation into two groups was based on subjects' choices on the two high-probability (HP) decks (75 and 60%) in the last 40 trials of the task. C–E, Postexperiment ratings show significant interaction between groups (learners and nonlearners) in pleasantness ratings (C) and preference rankings (D). E, In the probability assessment question, a significant linear trend is seen in the learners group but not in the nonlearners group.
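The grouping measure described in panel B can be sketched as a simple count. The caption states only that the split was based on choices of the two HP decks in the last 40 trials; the cutoff separating learners from nonlearners is not given here, so this sketch computes the count and nothing more. The function name and the convention that decks are indexed 0 to 3 in order of decreasing win probability are assumptions.

```python
def hp_choice_count(choices, hp_decks=(0, 1), window=40):
    """Count choices of the high-probability decks in the final `window` trials.

    `choices` is a sequence of deck indices, one per trial. Indices 0 and 1
    are assumed to denote the 75% and 60% decks.
    """
    recent = choices[-window:]  # uses all trials if fewer than `window` exist
    return sum(1 for c in recent if c in hp_decks)


if __name__ == "__main__":
    # Hypothetical choice sequence: cycling through all four decks.
    example = [0, 1, 2, 3] * 20
    print(hp_choice_count(example))
```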
Figure 2.
Random effects analysis showing prediction error (PE) correlations in ventral and dorsal striatum. A, The learners group showed significant correlations in bilateral ventral striatum and right dorsal striatum (n = 17; p < 0.001). B, The nonlearners group did not show significant correlations at a similar threshold (n = 12; p < 0.001). C, A direct comparison between PE-correlated activity in the learners group and the nonlearners group showed enhanced activity in learners compared with nonlearners in right dorsal striatum (p < 0.001). D, Parameter estimates of the direct comparison between learners and nonlearners. E, F, Time courses of the two groups in the right dorsal striatum during trials with high PE (E) and negative PE (F). Learners show stronger activity in both of these trial types than nonlearners.
Figure 3.
Second-level analysis showing simple regression between the learning criterion and PE in right dorsal striatum. A, Simple regression analysis shows correlation in right dorsal striatum between the learning criterion used (number of choices on the two HP decks in the last 40 trials of the task) and PE contrast maps of each subject. B, Scatterplot of the learning criterion and parameter estimates in the simple regression analysis shown in A.
