Comparative Study
Neuron. 2010 Aug 12;67(3):499-510.
doi: 10.1016/j.neuron.2010.06.031.

Multiple Timescales of Memory in Lateral Habenula and Dopamine Neurons

Free PMC article
Ethan S Bromberg-Martin et al. Neuron.

Abstract

Midbrain dopamine neurons are thought to signal predictions about future rewards based on the memory of past rewarding experience. Little is known about the source of their reward memory and the factors that control its timescale. Here we recorded from dopamine neurons, as well as one of their sources of input, the lateral habenula, while animals predicted upcoming rewards based on the past reward history. We found that lateral habenula and dopamine neurons accessed two distinct reward memories: a short-timescale memory expressed at the start of the task and a near-optimal long-timescale memory expressed when a future reward outcome was revealed. The short- and long-timescale memories were expressed in different forms of reward-oriented eye movements. Our data show that the habenula-dopamine pathway contains multiple timescales of memory and provide evidence for their role in motivated behavior.

Figures

Figure 1. Behavioral task
(A) Task diagram. The animal was required to fixate a spot of light, then follow the spot with a saccade when it stepped to the left or right side of the screen. In each block of 24 trials, saccades to one target direction were rewarded, while saccades to the other direction were unrewarded. (B) The task used a pseudorandom reward schedule in which the reward probability could be predicted with high accuracy as a weighted linear combination of past outcomes plus a constant factor. (C) The optimal weights (black dots) for each past reward outcome. The optimal weights were similar when constrained to take the form of an exponential decay (gray line). (D) Plot of true reward probability against predicted reward probability using the optimal exponentially decaying linear weights. Each dot represents one of the fifty possible six-trial reward histories in the pseudorandom schedule. The predicted reward probability was highly correlated with the true reward probability. (See also Figure S1)
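As a minimal sketch of the predictor described in panels (B) and (C), the snippet below computes a reward probability as a constant plus an exponentially decaying weighted sum of past outcomes. The function names, the 1/0 outcome coding, and the values of `baseline`, `amplitude`, and `decay` are illustrative assumptions; they are not the optimal weights reported in the paper, and the sign and size of the weights are placeholders.

```python
def exponential_weights(n_back, amplitude, decay):
    """Weights for outcomes 1..n_back trials ago: amplitude * decay**(k-1)."""
    return [amplitude * decay ** k for k in range(n_back)]


def predict_reward_probability(history, baseline, amplitude, decay):
    """Linear prediction from past outcomes.

    history[0] is the most recent outcome; 1 = rewarded, 0 = unrewarded
    (an assumed coding, not taken from the paper).
    """
    weights = exponential_weights(len(history), amplitude, decay)
    p = baseline + sum(w * x for w, x in zip(weights, history))
    return min(1.0, max(0.0, p))  # clip to a valid probability


# Hypothetical six-trial history and parameters, chosen only for illustration.
example = predict_reward_probability(
    [1, 0, 1, 1, 0, 0], baseline=0.3, amplitude=0.2, decay=0.5
)
print(example)
```

Constraining the weights to a single decay parameter reduces the six free weights to two numbers (amplitude and decay), which is what makes a "timescale" well defined.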
Figure 2. Behavioral memory for a single previous outcome
(A) Trace of horizontal eye position during two example rewarded trials, when the past trial was rewarded (Past R, red) or unrewarded (Past U, blue). Gray bars indicate the fixation point and saccade target. Left: eye position aligned at the time of fixation point onset. Right: eye position aligned at target onset. Inset: eye position aligned at target onset, showing a small bias in eye position towards the location of the rewarded target. (B) Measures of behavioral performance, separately for trials when the past trial was rewarded (red) or unrewarded (blue). Target RT bias is the mean difference in reaction time between saccades to unrewarded targets vs. rewarded targets. Bars are 80% bootstrap confidence intervals. Asterisks indicate statistical significance. ** indicates p < 10^−4 in combined data, p < 0.05 in monkey L; *** indicates p < 10^−4 in combined data, p < 0.05 in monkey L, p < 0.05 in monkey E; bootstrap test. The memory for past outcomes influenced behavioral performance at all times during the trial. (See also Figure S2)
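To make the 80% bootstrap confidence interval on the target RT bias concrete, here is a rough sketch using the percentile method. The caption does not say which bootstrap variant was used, so the percentile method, the independent resampling of the two trial groups, the function name `rt_bias_ci`, and the reaction times below are all assumptions.

```python
import random
from statistics import mean


def rt_bias_ci(rt_unrewarded, rt_rewarded, n_boot=2000, level=0.80, seed=0):
    """Percentile bootstrap CI for mean(RT unrewarded) - mean(RT rewarded)."""
    rng = random.Random(seed)

    def resample(xs):
        # Draw len(xs) values with replacement.
        return [rng.choice(xs) for _ in xs]

    diffs = sorted(
        mean(resample(rt_unrewarded)) - mean(resample(rt_rewarded))
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lo = diffs[int(tail * (n_boot - 1))]
    hi = diffs[int((1.0 - tail) * (n_boot - 1))]
    return lo, hi


# Made-up reaction times in ms, for illustration only.
unrewarded = [250, 260, 270, 280, 290]
rewarded = [200, 210, 220, 230, 240]
print(rt_bias_ci(unrewarded, rewarded))
```

An interval that excludes zero suggests a reliable RT bias at the chosen level; the significance tests reported in the caption are a separate bootstrap test.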
Figure 3. Neural memory for a single previous outcome
(A) Population average firing rate of lateral habenula neurons (LHb) when the past trial was rewarded (red) or unrewarded (blue). Firing rates were smoothed with a Gaussian kernel (σ = 15 ms). Colored bars on the bottom of each plot indicate times when the past trial outcome had a significant effect on neural activity (p < 0.01, paired Wilcoxon signed-rank test). (B) Same as (A), for dopamine neurons (DA). Lateral habenula and dopamine neurons had opposite mean response directions and opposite past-outcome effects during all three task events. (C) Schematic illustration of theoretical reward predictions at each time during the trial (see text for full description). When the reward prediction increased (upward arrows, positive prediction errors), lateral habenula neurons were inhibited and dopamine neurons were excited; when the reward prediction decreased (downward arrows, negative prediction errors), lateral habenula neurons were excited and dopamine neurons were inhibited. (See also Figure S3)
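The Gaussian smoothing mentioned in panel (A) can be sketched as a convolution of binned spike counts with a normalized kernel. Only σ = 15 ms comes from the caption; the 1 ms bin width, the truncation of the kernel at 4σ, the zero-padding at the edges, and the function names are assumptions made for this example.

```python
import math


def gaussian_kernel(sigma_ms, bin_ms=1.0, half_width_sigmas=4):
    """Discrete Gaussian kernel, truncated and normalized to sum to 1."""
    half = int(round(half_width_sigmas * sigma_ms / bin_ms))
    raw = [
        math.exp(-0.5 * (i * bin_ms / sigma_ms) ** 2)
        for i in range(-half, half + 1)
    ]
    total = sum(raw)
    return [v / total for v in raw]


def smooth_firing_rate(spike_counts, sigma_ms=15.0, bin_ms=1.0):
    """Convolve binned spike counts with a Gaussian; returns spikes/s per bin."""
    kernel = gaussian_kernel(sigma_ms, bin_ms)
    half = len(kernel) // 2
    n = len(spike_counts)
    rates = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < n:  # bins outside the recording are treated as empty
                acc += w * spike_counts[idx]
        rates.append(acc * 1000.0 / bin_ms)
    return rates


# One spike in the middle of a 201 ms trace becomes a smooth bump of unit area.
trace = [0] * 100 + [1] + [0] * 100
rates = smooth_firing_rate(trace)
```

Because the kernel sums to 1, the smoothed trace preserves the total spike count away from the edges.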
Figure 4. Multiple timescales of memory
(A–B) Memory effects in lateral habenula neurons (A) and dopamine neurons (B). Each panel shows the population average past-outcome effects – the difference in firing rate depending on whether a past outcome was rewarded or unrewarded (“Past R – Past U”), derived from the parameters of the fitted model described in the main text. Colored lines are the firing rate differences for specific past outcomes (black, red, orange, yellow = 1-, 2-, 3-, 4-trials-ago outcomes). The analysis was performed in a 151 ms sliding window advanced in 20 ms steps. Dark gray bars at the bottom of the plot indicate times when the population average memory amplitude was significantly different from zero, using the version of the memory model in which the weights followed an exponential decay (p < 0.01, Wilcoxon signed-rank test). Light gray bars below the axes are the time windows used for the analysis in Figure 5. Both lateral habenula and dopamine neurons had one-trial memories in response to the fixation point, but multiple-trial memories in response to the targets. (See also Figure S4)
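The sliding-window scheme (151 ms windows advanced in 20 ms steps) can be illustrated as follows. Note the simplification: the paper derives the "Past R – Past U" effect from a fitted model, whereas this sketch substitutes a raw difference of mean firing rates. The 1 ms sampling, the function names, and the example rates are assumptions.

```python
def sliding_windows(start_ms, stop_ms, width_ms=151, step_ms=20):
    """Inclusive (first, last) bounds of every window that fits in the range."""
    windows = []
    first = start_ms
    while first + width_ms - 1 <= stop_ms:
        windows.append((first, first + width_ms - 1))
        first += step_ms
    return windows


def windowed_rate_difference(rate_past_r, rate_past_u, width_ms=151, step_ms=20):
    """Mean (Past R - Past U) per window, for rates sampled every 1 ms.

    Returns (window_center_ms, difference) pairs. This is a stand-in for
    the model-based effect in the figure, not the authors' analysis.
    """
    n = min(len(rate_past_r), len(rate_past_u))
    out = []
    for first, last in sliding_windows(0, n - 1, width_ms, step_ms):
        r = sum(rate_past_r[first:last + 1]) / width_ms
        u = sum(rate_past_u[first:last + 1]) / width_ms
        out.append((first + width_ms // 2, r - u))
    return out


# Hypothetical constant rates (spikes/s) over 201 ms.
effects = windowed_rate_difference([10.0] * 201, [4.0] * 201)
```

With a 151 ms width and 20 ms steps, neighboring windows overlap heavily, so adjacent points on the resulting curve are not independent.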
Figure 5. Quantifying neural and behavioral timescales of memory
This figure shows the fitted influence of past outcomes on the activity of lateral habenula and dopamine neurons (A,B) and on behavioral anticipatory eye movements (C) and saccadic reaction times (D). (A) Fitted memory weights (β weights) for the lateral habenula neural population during responses to the rewarded target, unrewarded target, and fixation point (red, blue, and black). The memory weights are normalized so that β1 = 1 (Methods). Solid dots are memory weights from a fit in which all weights were allowed to vary independently (like those shown in Figure 4). Colored lines are a fit in which the weights were constrained to follow an exponential decay (Methods). This analysis was done on neural activity within the time windows indicated by the gray bars below the axes in Figure 4. Asterisks indicate that the fitted memory decay rate is significantly different from 1.0 (bootstrap test, p < 0.05). (B) Same as (A), but for dopamine neurons. Both lateral habenula and dopamine neurons had long-timescale memories in response to the targets, but short-timescale memories in response to the fixation point. (C) Fitted memory weights for anticipatory behavior, separately for anticipatory fixation (black) and anticipatory bias toward the rewarded target (gray). (D) Fitted memory weights for saccadic reaction times, separately for reactions to the fixation point (black) and targets (gray). (See also Figure S5)
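A rough illustration of the two steps named in panel (A): normalizing the memory weights so that β1 = 1, then constraining them to an exponential decay with rate D. The paper's Methods define the actual fitting procedure; the least-squares grid search below is a stand-in technique, and the function names and example weights are hypothetical.

```python
def normalize_weights(betas):
    """Scale weights so that the one-trial-ago weight (beta_1) equals 1."""
    return [b / betas[0] for b in betas]


def fit_decay_rate(normalized, grid_steps=2000, max_rate=2.0):
    """Grid-search the D minimizing squared error to D**(k-1), k = 1..K.

    Assumes the weights are already normalized so the first one is 1.
    """
    best_d, best_err = 0.0, float("inf")
    for i in range(grid_steps + 1):
        d = max_rate * i / grid_steps
        err = sum((w - d ** k) ** 2 for k, w in enumerate(normalized))
        if err < best_err:
            best_d, best_err = d, err
    return best_d


# Hypothetical weights for outcomes 1..4 trials ago.
weights = normalize_weights([2.0, 1.0, 0.5, 0.25])
decay = fit_decay_rate(weights)
```

In this parameterization D = 1 means past outcomes never fade, while a D near 0 means only the most recent outcome matters.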
Figure 6. Timescales of memory in tonic neural activity
This figure shows the effect of a single past outcome on tonic neural activity during the inter-trial interval and pre-target period, for two example neurons (A,B) and quantified for all lateral habenula and dopamine neurons (C,E). Also shown is the fitted influence of multiple past outcomes on tonic activity (D,F). (A) Activity of an example lateral habenula neuron on rewarded (red) and unrewarded (blue) trials. The activity is shown for the response to the target (Past-trial target), and then is followed into the next trial. Tonic activity was analyzed during the inter-trial interval (ITI, yellow 700 ms window before fixation point onset) and the pre-target period (Pre-target, yellow 700 ms window before target onset). Numbers indicate the neuron's ROC area for discriminating the past reward outcome. Colors indicate significance (p < 0.05, Wilcoxon rank-sum test). (B) Same as (A), for a dopamine neuron. (C) Histogram of lateral habenula neuron ROC areas for the inter-trial interval and pre-target period. Numbers indicate the percentage of neurons with significantly higher activity on past-rewarded trials (red) or past-unrewarded trials (blue). (D) Timescale of neural memory for the inter-trial interval (black) and pre-target period (gray). Conventions as in Figure 5. (E–F) Same as (C–D), for dopamine neurons. Memory effects during the pre-target period were not strong enough to estimate the timescale of memory. (See also Figure S6)
Figure 7. Time-varying changes in the timescale of memory
This figure quantifies the timescale of memory found in neural activity and behavior, separately for each lateral habenula and dopamine neuron response (LHb, DA) and for behavioral anticipatory eye movements and reaction times. Each data point for neural activity represents the fitted decay rate D for one of the curves shown in Figure 5A,B or 6D,F. The decay rates for behavioral anticipatory eye movements and reaction times are from Figure 5C,D. Far right: optimal timescale of memory (from Figure 1C). Asterisks indicate significant differences in the fitted decay rates (p < 0.05, bootstrap test; Methods). Non-significant differences are shown as written p-values. Error bars are 80% bootstrap confidence intervals. (See also Figure S7 and Supplemental Table 1)
