Review
J Comp Neurol. 2016 Jun 1;524(8):1699-711.
doi: 10.1002/cne.23880. Epub 2015 Sep 8.

Components and characteristics of the dopamine reward utility signal

William R Stauffer et al. J Comp Neurol. 2016.

Abstract

Rewards are defined by their behavioral functions in learning (positive reinforcement), approach behavior, economic choices, and emotions. Dopamine neurons respond to rewards with two components, similar to higher order sensory and cognitive neurons. The initial, rapid, unselective dopamine detection component reports all salient environmental events irrespective of their reward association. It is highly sensitive to factors related to reward and thus detects a maximal number of potential rewards. It also senses aversive stimuli but reports their physical impact rather than their aversiveness. The second response component processes reward value accurately and starts early enough to prevent confusion with unrewarded stimuli and objects. It codes reward value as a numeric, quantitative utility prediction error, consistent with formal concepts of economic decision theory. Thus, the dopamine reward signal is fast, highly sensitive and appropriate for driving and updating economic decisions.

Keywords: neuroeconomics; risk; stimulus components; subjective value; temporal discounting; utility.

Conflict of interest statement

Conflict of interests: The authors declare no conflict of interest.

Figures

Figure 1. Stimulus components and their neuronal processing
A: Scheme of sequential processing steps of individual stimulus components. B: Time course of target discrimination during visual search in a monkey frontal eye field neuron. The response initially detects the stimulus indiscriminately (blue zone) and only later differentiates between target and distractor (red). From Thompson et al. (1996). C: Distinction of initial indiscriminate detection response (blue) from main response component coding reward prediction error (red) in monkey dopamine neurons during temporal discounting. Reward value increases from blue via orange and green to red, inversely with delays of 2, 4, 8 and 16 s. From Kobayashi & Schultz (2008). D: Better distinction of the two dopamine response components in a more demanding random dot motion discrimination task. Better dot motion discrimination with increasing motion coherence (0%, 50%) results in increasing reward probability (from p=0.49 to p=0.99). Neuronal activity shows an initial, non-differential increase (blue), a decrease back to baseline, and then a second, graded increase reflecting reward value (due to increasing reward probability, red). The vertical dotted line marks the onset of the discriminating ocular saccade and indicates that assessment of the reward value of the identified motion direction requires several hundred milliseconds. From Nomoto et al. (2010).
Figure 2. High sensitivity of initial dopamine detection response component
Left: Enhancement by reward generalization. In the red trials, both rewarded and aversive conditioned stimuli are visual. In the blue trials (covering large parts of red trials except the peak), the conditioned aversive stimulus remains visual, but the rewarded conditioned stimulus is auditory. The activating response to the identical visual aversive stimulus is higher when the rewarded stimulus is also visual (red peak) rather than auditory (blue), demonstrating response enhancement by sensory similarity with rewarded stimulus. The blue activity depression reflects the second component. From Mirenowicz & Schultz (1996). Right: Enhancement by reward context. Left: in an experiment that separates unrewarded from rewarded contexts, dopamine neurons show only small activations to unrewarded large and small pictures (blue and black; red: response to liquid reward). Three distinct contexts are achieved by three well separated trial blocks, three different background pictures and removal of liquid spout in the picture trial types (center and bottom, blue and black). Right: by contrast, in an experiment using a common reward context without these separations, dopamine neurons show substantial activations to unrewarded large and small pictures. Each of the six picture pairs shows the trial background on a large computer monitor (left) and the continuing background together with the specific reward or superimposed picture (right). Each of the six neuronal traces shows the average population response from 31–33 monkey dopamine neurons. From Kobayashi & Schultz (2014).
Figure 3. Accurate dopamine value coding after initial detection response
Both rewarded and unrewarded conditioned stimuli (CS+, CS−) elicit a common initial increase of neuronal activity (blue). This activation continues after the CS+ (top, red), but turns into a depression after the CS− (bottom, red). In CS+ trials, the fully predicted reward elicits no response (no prediction error, right), whereas in CS− trials, a surprising (identical) test reward induces an activation (positive prediction error). Thus, correct, positive or negative reward value coding begins immediately after the common initial response and early enough for initiating corresponding behavioral reactions (green arrow); correct value coding continues until the time of reward (red arrow). From Waelti et al. (2001).
Figure 4. Phasic activation of dopamine neurons to aversive stimulus reflects physical salience rather than aversiveness
The increased aversiveness generated by the more concentrated bitter denatonium solution decreases the dopamine response (physical impact of liquid delivery remains constant), suggesting an inverse relationship between aversiveness and dopamine activation. The increased depression from the higher aversiveness reduces the activation generated by the physical stimulation from the liquid drops. Average population responses from 19 and 14 monkey dopamine neurons, respectively. From Fiorillo et al. (2013b).
Figure 5. Dopamine neurons code subjective rather than objective reward value
A: Neuronal coding of common currency subjective value. Stimulus responses follow preferences among different liquid and food rewards. Rewards were different quantities of blackcurrant juice (top: blue) and a liquefied mixture of banana, chocolate and hazelnut food (yellow banana). Color bars below the rewards at top refer to the color of the neuronal responses; curved arrows indicate behavioral preferences assessed in binary behavioral choices between the indicated rewards. B: Increase of stimulus responses with risky compared to safe rewards (vertical arrows). Blue and green colors indicate blackcurrant juice (more preferred = higher value) and orange juice (less preferred = lower value), respectively; S and R indicate safe reward amounts and binary, equiprobable gambles between two amounts of the same juice with identical expected value, respectively. A and B from Lak et al. (2014). C: Temporal discounting: decreasing responses of dopamine neurons to stimuli predicting increasing reward delays of 2–16 s (red), corresponding to subjective value decrements measured by intertemporal choices (blue), contrasted with constant physical amount (black). Y-axis shows behavioral value and neuronal responses in % of reward amount at 2 s delay (0.56 ml). From Kobayashi & Schultz (2008).
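The normalization used in panel C (value expressed in % of the 0.56 ml reward at the 2 s delay) can be illustrated with a minimal sketch. The hyperbolic form and the discount rate `k` below are assumptions chosen for illustration only; the caption does not specify a discount function, and the numbers produced are not the measured values.

```python
# Illustrative sketch only: hyperbolic discounting is an ASSUMED functional form,
# and k = 0.2 per second is a HYPOTHETICAL discount rate, not a fitted value.

def hyperbolic_value(amount_ml, delay_s, k=0.2):
    """Subjective value of a delayed reward under assumed hyperbolic discounting."""
    return amount_ml / (1.0 + k * delay_s)

DELAYS_S = [2, 4, 8, 16]   # reward delays named in the caption
AMOUNT_ML = 0.56           # constant physical reward amount from the caption

# Express each discounted value in % of the value at the shortest (2 s) delay,
# mirroring the y-axis normalization described for panel C.
reference = hyperbolic_value(AMOUNT_ML, DELAYS_S[0])
PERCENT_OF_2S = [100.0 * hyperbolic_value(AMOUNT_ML, d) / reference for d in DELAYS_S]

if __name__ == "__main__":
    for delay, pct in zip(DELAYS_S, PERCENT_OF_2S):
        print(f"delay {delay:2d} s: physical amount 100.0 %, subjective value {pct:5.1f} %")
```

Under these assumptions the physical amount stays constant across delays while the subjective value falls monotonically, which is the qualitative contrast between the black and the blue/red curves.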
Figure 6. Dopamine neurons code formal economic utility
A: Positive utility prediction error responses to unpredicted juice rewards (black), superimposed on nonlinear utility function in same monkey (red). Psychophysically varied behavioral choices between a variable safe reward and a specific binary, equiprobable gamble (p=0.5 each outcome) served to assess its certainty equivalent (subjective value of gamble indicated by amount of safe reward at choice indifference); the certainty equivalents of specifically placed gambles served to estimate the utility function according to the structured 'fractile' procedure (Caraco 1980; Machina 1987). B: Top: three conditioned stimuli indicating three binary, equiprobable gambles (0.1–0.4 ml; 0.5–0.8 ml; 0.9–1.2 ml juice); bar height specifies juice volume. In pseudorandom alternation, one of these stimuli is shown to the animal, followed 1.5 s later by one of the two specified juice volumes. Bottom: nonlinear utility function (same as in A). Delivery of higher reward in each gamble generates identical positive physical prediction error (0.15 ml, red, black and blue dots). However, due to different positions on the utility function, the prediction errors vary non-monotonically in utility (ΔRu). Shaded areas indicate physical volumes (horizontal) and utilities (vertical) of gambles. C: Dopamine coding of utility prediction error (same animal as in B). The red, black and blue traces indicate responses to the higher outcomes of the three gambles shown as colored fat dots in B (0.4 ml, 0.8 ml, 1.2 ml). These responses reflect the positive utility prediction errors that vary according to the slope of the utility function (ΔRu in B), rather than the identical positive physical prediction errors of +0.15 ml. A–C from Stauffer et al. (2014).
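The arithmetic behind panel B can be sketched as follows. The three gambles and the equal outcome probabilities come from the caption; the S-shaped `utility` function is a hypothetical stand-in (a smoothstep curve scaled to 1.2 ml), not the utility function measured in the animal, so only the qualitative pattern is meaningful.

```python
# Illustrative sketch only: `utility` is a HYPOTHETICAL S-shaped curve
# (convex at small volumes, concave at large ones), not the measured function.

def utility(volume_ml, max_ml=1.2):
    """Assumed smoothstep utility, scaled so that utility(max_ml) == 1."""
    t = min(max(volume_ml / max_ml, 0.0), 1.0)
    return 3 * t**2 - 2 * t**3

def prediction_errors(low_ml, high_ml):
    """Prediction errors when the higher outcome of an equiprobable gamble is delivered."""
    expected_volume = 0.5 * (low_ml + high_ml)
    expected_utility = 0.5 * (utility(low_ml) + utility(high_ml))
    physical_pe = high_ml - expected_volume             # in ml
    utility_pe = utility(high_ml) - expected_utility    # in utility units
    return physical_pe, utility_pe

# Binary, equiprobable gambles from the caption (juice volumes in ml).
GAMBLES = [(0.1, 0.4), (0.5, 0.8), (0.9, 1.2)]
RESULTS = [prediction_errors(low, high) for low, high in GAMBLES]

if __name__ == "__main__":
    for (low, high), (phys, util) in zip(GAMBLES, RESULTS):
        print(f"gamble {low}-{high} ml: physical PE {phys:+.2f} ml, utility PE {util:+.3f}")
```

With this assumed curve the physical prediction error is the same 0.15 ml for all three gambles, while the utility prediction error is largest for the middle gamble, where the curve is steepest. That is the non-monotonic pattern the caption attributes to the slope of the utility function.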

References

    1. Ambroggi F, Ishikawa A, Fields HL, Nicola SM. Basolateral amygdala neurons facilitate reward-seeking behavior by exciting nucleus accumbens neurons. Neuron. 2008;59:648–661.
    2. Arsenault JT, Rima S, Stemmann H, Vanduffel W. Role of the primate ventral tegmental area in reinforcement and motivation. Curr Biol. 2014;24:1347–1353.
    3. Budygin EA, Park J, Bass CE, Grinevich VP, Bonin KD, Wightman RM. Aversive stimulus differentially triggers subsecond dopamine release in reward regions. Neuroscience. 2012;201:331–337.
    4. Bushnell MC, Goldberg ME, Robinson DL. Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. J Neurophysiol. 1981;46:755–772.
    5. Chuhma N, Mingote S, Moore H, Rayport S. Dopamine neurons control striatal cholinergic neurons via regionally heterogeneous dopamine and glutamate signaling. Neuron. 2014;81:901–912.