2010 Aug 12;67(3):499-510.
Multiple Timescales of Memory in Lateral Habenula and Dopamine Neurons
Midbrain dopamine neurons are thought to signal predictions about future rewards based on the memory of past rewarding experience. Little is known about the source of their reward memory and the factors that control its timescale. Here we recorded from dopamine neurons, as well as one of their sources of input, the lateral habenula, while animals predicted upcoming rewards based on the past reward history. We found that lateral habenula and dopamine neurons accessed two distinct reward memories: a short-timescale memory expressed at the start of the task and a near-optimal long-timescale memory expressed when a future reward outcome was revealed. The short- and long-timescale memories were expressed in different forms of reward-oriented eye movements. Our data show that the habenula-dopamine pathway contains multiple timescales of memory and provide evidence for their role in motivated behavior.
(c) 2010 Elsevier Inc. All rights reserved.
Figure 1. Behavioral task
(A) Task diagram. The animal was required to fixate a spot of light, then follow the spot with a saccade when it stepped to the left or right side of the screen. In each block of 24 trials, saccades to one target direction were rewarded, while saccades to the other direction were unrewarded. (B) The task used a pseudorandom reward schedule in which the reward probability could be predicted with high accuracy as a weighted linear combination of past outcomes plus a constant factor. (C) The optimal weights (black dots) for each past reward outcome. The optimal weights were similar when constrained to take the form of an exponential decay (gray line). (D) Plot of true reward probability against predicted reward probability using the optimal exponentially decaying linear weights. Each dot represents one of the fifty possible six-trial reward histories in the pseudorandom schedule. The predicted reward probability was highly correlated with the true reward probability. (See also Figure S1)
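The prediction rule described in (B) and (C) can be sketched in a few lines. This is a minimal illustration, assuming outcomes are coded 1 (rewarded) and 0 (unrewarded) and that the weight for the outcome k trials ago is amplitude × decay^(k−1). The function names and every numeric value are hypothetical placeholders, not the weights reported in the paper, and the clipping to [0, 1] is an added assumption.

```python
def exponential_weights(amplitude, decay, n_back):
    """Weights for outcomes 1..n_back trials ago: amplitude * decay**(k-1)."""
    return [amplitude * decay ** k for k in range(n_back)]


def predict_reward_probability(history, weights, constant):
    """Weighted linear combination of past outcomes plus a constant.

    history[0] is the most recent outcome (1 = rewarded, 0 = unrewarded).
    The result is clipped to [0, 1] (an assumption of this sketch).
    """
    p = constant + sum(w * r for w, r in zip(weights, history))
    return min(1.0, max(0.0, p))


# Placeholder parameters: sign and magnitude are illustrative only.
weights = exponential_weights(amplitude=0.2, decay=0.5, n_back=3)
prediction = predict_reward_probability([1, 0, 1], weights, constant=0.3)
```

Constraining the weights to an exponential form reduces the model to three parameters (amplitude, decay, constant), which is the sense in which the gray line in (C) is a constrained version of the black dots.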
Figure 2. Behavioral memory for a single previous outcome
(A) Trace of horizontal eye position during two example rewarded trials, when the past trial was rewarded (Past R, red) or unrewarded (Past U, blue). Gray bars indicate the fixation point and saccade target. Left: eye position aligned at the time of fixation point onset. Right: eye position aligned at target onset. Inset: eye position aligned at target onset, showing a small bias in eye position towards the location of the rewarded target. (B) Measures of behavioral performance, separately for trials when the past trial was rewarded (red) or unrewarded (blue). Target RT bias is the mean difference in reaction time between saccades to unrewarded targets vs. rewarded targets. Bars are 80% bootstrap confidence intervals. Asterisks indicate statistical significance. ** indicates p < 10⁻⁴ in combined data, p < 0.05 in monkey L; *** indicates p < 10⁻⁴ in combined data, p < 0.05 in monkey L, p < 0.05 in monkey E; bootstrap test. The memory for past outcomes influenced behavioral performance at all times during the trial. (See also Figure S2)
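The caption reports 80% bootstrap confidence intervals for quantities such as the target RT bias. The sketch below shows one common way to obtain such an interval, a percentile bootstrap on the difference of two means. The caption does not say which bootstrap variant was used, so the percentile method, the function name, and the resample count are assumptions.

```python
import random
import statistics


def bootstrap_ci_mean_diff(sample_a, sample_b, level=0.80, n_boot=2000, seed=0):
    """Percentile bootstrap CI for mean(sample_a) - mean(sample_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = [rng.choice(sample_a) for _ in sample_a]
        resample_b = [rng.choice(sample_b) for _ in sample_b]
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lo_idx = int(n_boot * (1 - level) / 2)
    hi_idx = min(n_boot - 1, int(n_boot * (1 + level) / 2))
    return diffs[lo_idx], diffs[hi_idx]


# Made-up reaction times in ms, purely for illustration.
rt_unrewarded = [250, 260, 255, 270, 265]
rt_rewarded = [200, 210, 205, 215, 208]
low, high = bootstrap_ci_mean_diff(rt_unrewarded, rt_rewarded)
```

An interval that excludes zero indicates a reliable RT bias at the chosen level. An 80% interval is narrower than the more familiar 95% interval.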
Figure 3. Neural memory for a single previous outcome
(A) Population average firing rate of lateral habenula neurons (LHb) when the past trial was rewarded (red) or unrewarded (blue). Firing rates were smoothed with a Gaussian kernel (σ = 15 ms). Colored bars on the bottom of each plot indicate times when the past trial outcome had a significant effect on neural activity (p < 0.01, paired Wilcoxon signed-rank test). (B) Same as (A), for dopamine neurons (DA). Lateral habenula and dopamine neurons had opposite mean response directions and opposite past-outcome effects during all three task events. (C) Schematic illustration of theoretical reward predictions at each time during the trial (see text for full description). When the reward prediction increased (upward arrows, positive prediction errors), lateral habenula neurons were inhibited and dopamine neurons were excited; when the reward prediction decreased (downward arrows, negative prediction errors), lateral habenula neurons were excited and dopamine neurons were inhibited. (See also Figure S3)
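The Gaussian smoothing mentioned in (A) can be illustrated as follows. This is a sketch under stated assumptions: spikes are binned at 1 ms, the kernel is truncated at ±4σ and normalized to unit sum, and samples beyond the ends of the trace are treated as zero. None of these details come from the caption, which specifies only σ = 15 ms.

```python
import math


def gaussian_kernel(sigma_ms, dt_ms=1.0, n_sigmas=4):
    """Discrete Gaussian kernel, truncated at +/- n_sigmas and summing to 1."""
    half = int(round(n_sigmas * sigma_ms / dt_ms))
    values = [math.exp(-0.5 * (i * dt_ms / sigma_ms) ** 2)
              for i in range(-half, half + 1)]
    total = sum(values)
    return [v / total for v in values]


def smooth_rate(spike_counts, sigma_ms=15.0, dt_ms=1.0):
    """Convolve binned spike counts with the kernel; return spikes per second."""
    kernel = gaussian_kernel(sigma_ms, dt_ms)
    half = len(kernel) // 2
    n = len(spike_counts)
    rates = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < n:  # bins outside the trace contribute nothing
                acc += w * spike_counts[idx]
        rates.append(acc * 1000.0 / dt_ms)
    return rates
```

Because the kernel sums to 1, smoothing preserves the total spike count away from the edges of the trace, so the area under the smoothed rate still equals the number of spikes.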
Figure 4. Multiple timescales of memory
(A–B) Memory effects in lateral habenula neurons (A) and dopamine neurons (B). Each panel shows the population average past-outcome effects: the difference in firing rate depending on whether a past outcome was rewarded or unrewarded (“Past R – Past U”), derived from the parameters of the fitted model described in the main text. Colored lines are the firing rate differences for specific past outcomes (black, red, orange, yellow = 1-, 2-, 3-, 4-trials-ago outcomes). The analysis was performed in a 151 ms sliding window advanced in 20 ms steps. Dark gray bars at the bottom of the plot indicate times when the population average memory amplitude was significantly different from zero, using the version of the memory model in which the weights followed an exponential decay (p < 0.01, Wilcoxon signed-rank test). Light gray bars below the axes are the time windows used for the analysis in Figure 5. Both lateral habenula and dopamine neurons had one-trial memories in response to the fixation point, but multiple-trial memories in response to the targets. (See also Figure S4)
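The sliding-window scheme (151 ms windows advanced in 20 ms steps) can be sketched as below. It assumes spike times are integer milliseconds, so that a window of center ± 75 ms covers exactly 151 one-millisecond bins. The function name and the choice to report the window center are illustrative assumptions. The model fitting performed inside each window is not shown.

```python
def sliding_window_rates(spike_times_ms, t_start, t_end, width=151, step=20):
    """Return (window_center_ms, firing_rate_hz) for each window position.

    Assumes integer-millisecond spike times and an odd window width, so
    each window spans center - width // 2 .. center + width // 2 inclusive.
    """
    half = width // 2
    results = []
    for center in range(t_start + half, t_end - half + 1, step):
        n_spikes = sum(1 for t in spike_times_ms
                       if center - half <= t <= center + half)
        results.append((center, n_spikes * 1000.0 / width))
    return results


# Illustrative spike times in ms relative to an arbitrary event.
windows = sliding_window_rates([100, 110, 300], t_start=0, t_end=400)
```

Because the step (20 ms) is much smaller than the window (151 ms), neighboring windows share most of their data, so adjacent points on the resulting curves are not statistically independent.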
Figure 5. Quantifying neural and behavioral timescales of memory
This figure shows the fitted influence of past outcomes on the activity of lateral habenula and dopamine neurons (A,B) and on behavioral anticipatory eye movements (C) and saccadic reaction times (D). (A) Fitted memory weights (β weights) for the lateral habenula neural population during responses to the rewarded target, unrewarded target, and fixation point (red, blue, and black). The memory weights are normalized so that β₁ = 1 (Methods). Solid dots are memory weights from a fit in which all weights were allowed to vary independently (like those shown in Figure 4). Colored lines are a fit in which the weights were constrained to follow an exponential decay (Methods). This analysis was done on neural activity within the time windows indicated by the gray bars below the axes in Figure 4. Asterisks indicate that the fitted memory decay rate is significantly different from 1.0 (bootstrap test, p < 0.05). (B) Same as (A), but for dopamine neurons. Both lateral habenula and dopamine neurons had long-timescale memories in response to the targets, but short-timescale memories in response to the fixation point. (C) Fitted memory weights for anticipatory behavior, separately for anticipatory fixation (black) and anticipatory bias toward the rewarded target (gray). (D) Fitted memory weights for saccadic reaction times, separately for reactions to the fixation point (black) and targets (gray). (See also Figure S5)
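To make the normalization and the exponential constraint concrete, here is a simplified sketch. It assumes the constrained weights take the form β_k = D^(k−1), so that β₁ = 1 and D is the decay rate, and it estimates D by a least-squares grid search over already-fitted weights. The paper's Methods fit the model to the data directly, so this two-step shortcut, the parameterization, and the function names are assumptions made for illustration only.

```python
def normalize_weights(betas):
    """Rescale the weights so that the 1-trial-ago weight equals 1."""
    return [b / betas[0] for b in betas]


def fit_decay_rate(norm_betas, grid_steps=1000):
    """Grid-search the D in [0, 1] minimizing sum_k (beta_k - D**(k-1))**2.

    norm_betas[0] is the 1-trial-ago weight, so index k maps to exponent k.
    """
    best_d, best_err = 0.0, float("inf")
    for i in range(grid_steps + 1):
        d = i / grid_steps
        err = sum((b - d ** k) ** 2 for k, b in enumerate(norm_betas))
        if err < best_err:
            best_d, best_err = d, err
    return best_d


# Made-up raw weights that halve with each trial back.
decay = fit_decay_rate(normalize_weights([2.0, 1.0, 0.5, 0.25]))
```

Under this parameterization, a D near 0 means only the most recent outcome matters (a short-timescale memory), while a larger D means older outcomes retain influence (a long-timescale memory).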
Figure 6. Timescales of memory in tonic neural activity
This figure shows the effect of a single past outcome on tonic neural activity during the inter-trial interval and pre-target period, for two example neurons (A,B) and quantified for all lateral habenula and dopamine neurons (C,E). Also shown is the fitted influence of multiple past outcomes on tonic activity (D,F). (A) Activity of an example lateral habenula neuron on rewarded (red) and unrewarded (blue) trials. The activity is shown for the response to the target (Past-trial target), and then is followed into the next trial. Tonic activity was analyzed during the inter-trial interval (ITI, yellow 700 ms window before fixation point onset) and the pre-target period (Pre-target, yellow 700 ms window before target onset). Numbers indicate the neuron's ROC area for discriminating the past reward outcome. Colors indicate significance (p < 0.05, Wilcoxon rank-sum test). (B) Same as (A), for a dopamine neuron. (C) Histogram of lateral habenula neuron ROC areas for the inter-trial interval and pre-target period. Numbers indicate the percentage of neurons with significantly higher activity on past-rewarded trials (red) or past-unrewarded trials (blue). (D) Timescale of neural memory for the inter-trial interval (black) and pre-target period (gray). Conventions as in Figure 5. (E–F) same as (C–D), for dopamine neurons. Memory effects during the pre-target period were not strong enough to estimate the timescale of memory. (See also Figure S6)
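The ROC area used in (A) and (C) can be computed directly from two sets of firing rates. The sketch below uses the standard equivalence between ROC area and the normalized Mann-Whitney statistic: the probability that a randomly chosen past-rewarded trial has a higher rate than a randomly chosen past-unrewarded trial, with ties counted as one half. The function name and the direction convention (values above 0.5 mean higher activity after reward) are assumptions of this sketch.

```python
def roc_area(rates_past_rewarded, rates_past_unrewarded):
    """ROC area for discriminating past outcome from single-trial rates.

    Equals P(x > y) + 0.5 * P(x == y), with x drawn from past-rewarded
    trials and y from past-unrewarded trials.
    """
    wins = 0.0
    for x in rates_past_rewarded:
        for y in rates_past_unrewarded:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(rates_past_rewarded) * len(rates_past_unrewarded))


# Made-up single-trial firing rates (spikes/s) in a 700 ms window.
area = roc_area([3, 4, 5], [1, 2, 3])
```

An area of 0.5 means the two distributions are indistinguishable, while values near 0 or 1 mean the neuron's tonic rate reliably separates past-rewarded from past-unrewarded trials.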
Figure 7. Time-varying changes in the timescale of memory
This figure quantifies the timescale of memory found in neural activity and behavior, separately for each lateral habenula and dopamine neuron response (LHb, DA) and for behavioral anticipatory eye movements and reaction times. Each data point for neural activity represents the fitted decay rate D for one of the curves shown in Figure 5A,B or 6D,F. The decay rates for behavioral anticipatory eye movements and reaction times are from Figure 5C,D. Far right: optimal timescale of memory (from Figure 1C). Asterisks indicate significant differences in the fitted decay rates (p < 0.05, bootstrap test; Methods). Non-significant differences are shown as written p-values. Error bars are 80% bootstrap confidence intervals. (See also Figure S7 and Supplemental Table 1)