Case Reports

Dorsal striatum is necessary for stimulus-value but not action-value learning in humans

Khoi Vo et al. Brain. 2014 Dec;137(Pt 12):3129-35. doi: 10.1093/brain/awu277. Epub 2014 Oct 1.

Abstract

Several lines of evidence implicate the striatum in learning from experience on the basis of positive and negative feedback. However, the necessity of the striatum for such learning has been difficult to demonstrate in humans, because brain damage is rarely restricted to this structure. Here we test a rare individual with widespread bilateral damage restricted to the dorsal striatum. His performance was impaired and not significantly different from chance on several classic learning tasks, consistent with current theories regarding the role of the striatum. However, he also exhibited remarkably intact performance on a different subset of learning paradigms. The tasks he could perform can all be solved by learning the value of actions, while those he could not perform can only be solved by learning the value of stimuli. Although dorsal striatum is often thought to play a specific role in action-value learning, we find surprisingly that dorsal striatum is necessary for stimulus-value but not action-value learning in humans.
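The stimulus-value versus action-value distinction that the abstract turns on can be made concrete with a small simulation. The sketch below is illustrative only (it is not the authors' tasks or analysis code): it assumes a two-option bandit in which stimulus identity, not screen side, predicts reward, so a learner that tracks values over stimuli can solve it while one that tracks values over left/right actions cannot.

```python
import random

def simulate(learner, trials=2000, alpha=0.2, epsilon=0.1, seed=0):
    """Two-option bandit: stimulus 0 pays with p=0.8, stimulus 1 with p=0.2,
    and the two stimuli swap screen sides at random on every trial.
    learner='stimulus' keys values on stimulus identity;
    learner='action' keys values on response side (left=0, right=1)."""
    rng = random.Random(seed)
    q = {0: 0.0, 1: 0.0}
    hits = 0
    for _ in range(trials):
        left = rng.choice([0, 1])            # stimulus shown on the left
        stims = (left, 1 - left)             # (left side, right side)
        if rng.random() < epsilon:           # epsilon-greedy exploration
            side = rng.choice([0, 1])
        elif learner == "stimulus":
            side = 0 if q[stims[0]] >= q[stims[1]] else 1
        else:                                # action-value: stimuli are ignored
            side = 0 if q[0] >= q[1] else 1
        chosen = stims[side]
        reward = 1.0 if rng.random() < (0.8 if chosen == 0 else 0.2) else 0.0
        key = chosen if learner == "stimulus" else side
        q[key] += alpha * (reward - q[key])  # delta-rule update
        hits += (chosen == 0)
    return hits / trials
```

Under these assumed parameters the stimulus-value learner chooses the rich stimulus on most trials, while the action-value learner stays near chance, loosely mirroring the logic by which tasks are classified as solvable by stimulus-value or action-value learning.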

Keywords: action-value; reinforcement learning; stimulus-value; striatum.

Figures

Figure 1
MRI from the acute phase of Patient XG’s injury (top row) and more recently (bottom row). Three different contrasts (FLAIR, T2, T1 with gadolinium) are shown. Going left to right, axial images progress from inferior to superior and coronal images progress from posterior to anterior. In the acute images, contrast enhancement can be seen in the caudate and putamen bilaterally, indicative of recent injury (i.e. inflammation and breakdown of the blood–brain barrier). More recent images show pronounced loss of tissue in the caudate and putamen bilaterally. Both sets of images show sparing of ventral regions of the striatum, including the nucleus accumbens.
Figure 2
(A) Patient XG and healthy controls’ performance in the Weather Prediction Task. Plotted is the proportion of times participants chose the more likely outcome (‘rain’ or ‘shine’) given the cue combination. (B) Patient XG and healthy controls’ performance in the training phase of the Probabilistic Selection Task. Plotted is the proportion of times participants chose the more highly rewarded stimulus (A, C and E) from each pair (AB, CD and EF) and across all pairs during the training phase. (C) Patient XG and healthy controls’ performance in the test phase of the Probabilistic Selection Task. Plotted is the proportion of times participants chose A from novel pairs, avoided B from novel pairs, and chose the highest-value stimulus across all novel pairs. In A–C, error bars denote standard deviation and asterisks denote P < 0.05. (D, E and F) Patient XG and healthy controls’ performance in the Crab Game, Fish Game and Bait Game, respectively. Plotted is the probability of choosing the richer option across the course of a block (averaging over 16 total blocks per task, representing 14 total transitions). Vertical dashed line denotes block transition. Horizontal line denotes chance performance. Data smoothing kernel = 11.
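The "smoothing kernel = 11" in these captions refers to moving-average (boxcar) smoothing of the trial-by-trial choice trace. A minimal sketch of such smoothing, assuming a centred window that shrinks at the edges (the caption does not specify the edge handling):

```python
import numpy as np

def smooth(trace, kernel=11):
    """Moving-average smoothing of a per-trial choice trace.
    Each point becomes the mean of a centred window of `kernel` trials;
    windows are truncated at the edges so output length equals input length."""
    trace = np.asarray(trace, dtype=float)
    half = kernel // 2
    return np.array([trace[max(0, i - half): i + half + 1].mean()
                     for i in range(len(trace))])
```

Applied to a 0/1 choice sequence, this turns discrete choices into a smooth probability-of-richer-option curve of the kind plotted in panels D–F.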
Figure 3
(A and C) Patient XG and healthy controls’ performance in the stimulus-value learning (A) and action-value learning (C) tasks, respectively. Plotted is the probability of choosing the richer option across the course of a block (averaging over 16 total blocks per task, representing 14 total transitions). Vertical dashed line denotes block transition. Horizontal line denotes chance performance. Data smoothing kernel = 11. (B and D) General linear model fits showing the influence of past rewards on learning for stimulus-value learning (B) and action-value learning (D). Control fits in both tasks show the signature of reinforcement learning; Patient XG’s fits are consistent with reinforcement learning only in action-value learning, not in stimulus-value learning.
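A general linear model of "the influence of past rewards on learning" is, in this style of analysis, typically a regression of the current choice on lagged reward history, where a positive, decaying weight profile over lags is the signature of incremental reinforcement learning. The sketch below is an illustration of that analysis style under assumed details (logistic link, signed reward regressors, no intercept), not the authors' exact model:

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, iters=4000):
    """Logistic regression fit by batch gradient ascent (no intercept)."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w

def lagged_design(x, lags):
    """Stack lag-1..lag-`lags` copies of regressor x as columns."""
    n = len(x)
    return np.column_stack([np.concatenate([np.zeros(k + 1), x[:n - k - 1]])
                            for k in range(lags)])

def make_choices(true_w=(2.0, 1.0, 0.5), n=20000, seed=0):
    """Generate choices whose log-odds depend on recent signed rewards
    (+1 = rewarded choice of option 1, -1 = rewarded choice of option 0)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    c = np.zeros(n)
    for t in range(n):
        logit = sum(w * (x[t - k - 1] if t > k else 0.0)
                    for k, w in enumerate(true_w))
        c[t] = rng.random() < 1.0 / (1.0 + np.exp(-logit))
        x[t] = (2 * c[t] - 1) * (rng.random() < 0.5)   # signed reward
    return lagged_design(x, len(true_w)), c
```

Fitting such a model to a control-like synthetic learner recovers weights that decay with lag; a learner who is insensitive to past rewards would show flat weights near zero, which is the contrast the panel draws between Patient XG's two task types.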
Figure 4
(A) Summary of Patient XG and healthy controls’ performance across all learning tasks. (B) Summary of Patient XG and healthy controls’ learning rates. Learning rates are estimated in a basic reinforcement learning model fit to behaviour. Tasks that are solvable only by learning stimulus values are highlighted in yellow. Asterisks denote P < 0.05.
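Estimating learning rates with "a basic reinforcement learning model fit to behaviour" conventionally means maximum-likelihood fitting of a delta-rule (Rescorla-Wagner) value update with a softmax choice rule. A minimal illustrative sketch, assuming a grid search over the learning rate with a fixed softmax temperature (the authors' actual model and fitting procedure may differ):

```python
import numpy as np

def rw_loglik(alpha, beta, choices, rewards):
    """Log-likelihood of 2-option delta-rule values with softmax choice."""
    q = np.zeros(2)
    ll = 0.0
    for c, r in zip(choices, rewards):
        p = np.exp(beta * q - np.max(beta * q))
        p /= p.sum()
        ll += np.log(p[c])
        q[c] += alpha * (r - q[c])          # delta-rule update
    return ll

def fit_alpha(choices, rewards, beta=3.0):
    """Maximum-likelihood learning rate by grid search (beta held fixed)."""
    grid = np.linspace(0.01, 0.99, 99)
    return grid[np.argmax([rw_loglik(a, beta, choices, rewards)
                           for a in grid])]

def simulate_rw(alpha=0.3, beta=3.0, trials=2000, p_reward=(0.8, 0.2), seed=1):
    """Generate synthetic choices/rewards from the same model (for recovery)."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    choices, rewards = [], []
    for _ in range(trials):
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        c = int(rng.random() < p[1])
        r = float(rng.random() < p_reward[c])
        choices.append(c)
        rewards.append(r)
        q[c] += alpha * (r - q[c])
    return choices, rewards
```

A parameter-recovery check (simulate with a known learning rate, then refit) is the standard sanity test for this kind of model before interpreting a patient's estimated learning rate.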
