J Neurophysiol. 2019 May 1;121(5):1748-1760.
doi: 10.1152/jn.00817.2018. Epub 2019 Mar 13.

Adapting the flow of time with dopamine


John G Mikhael et al. J Neurophysiol.

Abstract

The modulation of interval timing by dopamine (DA) has been well established over decades of research. The nature of this modulation, however, has remained controversial: Although the pharmacological evidence has largely suggested that time intervals are overestimated with higher DA levels, more recent optogenetic work has shown the opposite effect. In addition, a large body of work has asserted DA's role as a "reward prediction error" (RPE), or a teaching signal that allows the basal ganglia to learn to predict future rewards in reinforcement learning tasks. Whether these two seemingly disparate accounts of DA may be related has remained an open question. By taking a reinforcement learning-based approach to interval timing, we show here that the RPE interpretation of DA naturally extends to its role as a modulator of timekeeping and, furthermore, that this view reconciles the seemingly conflicting observations. We derive a biologically plausible, DA-dependent plasticity rule that can modulate the rate of timekeeping in either direction and whose effect depends on the timing of the DA signal itself. This bidirectional update rule can account for the results from pharmacology and optogenetics, as well as the behavioral effects of reward rate on interval timing and the temporal selectivity of striatal neurons. Hence, by adopting a single RPE interpretation of DA, our results take a step toward unifying computational theories of reinforcement learning and interval timing.

NEW & NOTEWORTHY How does dopamine (DA) influence interval timing? A large body of pharmacological evidence has suggested that DA accelerates timekeeping mechanisms. However, recent optogenetic work has shown exactly the opposite effect. In this article, we relate DA's role in timekeeping to its most established role, as a critical component of reinforcement learning. This allows us to derive a neurobiologically plausible framework that reconciles a large body of DA's temporal effects, including pharmacological, behavioral, electrophysiological, and optogenetic findings.
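The RPE account invoked in the abstract can be made concrete with a minimal tabular temporal-difference sketch (illustrative only: the discretization, parameter values, and function names below are assumptions, not the authors' implementation). The RPE, delta = r + gamma*V(t+1) - V(t), is the quantity identified with phasic DA:

```python
import numpy as np

def td_learn(T=20, reward_time=15, gamma=0.95, alpha=0.1, n_trials=2000):
    """Tabular TD(0) over discrete time steps within a trial; the RPE
    (delta) plays the teaching-signal role attributed to dopamine."""
    V = np.zeros(T + 1)  # V[T] stays 0 (end of trial)
    for _ in range(n_trials):
        for t in range(T):
            r = 1.0 if t == reward_time else 0.0
            delta = r + gamma * V[t + 1] - V[t]  # reward prediction error
            V[t] += alpha * delta
    return V

V = td_learn()
```

After learning, value ramps up toward the reward time and the RPE at the expected reward shrinks toward zero, the baseline case against which the timing effects in the figures below are described.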

Keywords: dopamine; interval timing; reinforcement learning; reward prediction error.


Conflict of interest statement

No conflicts of interest, financial or otherwise, are declared by the authors.

Figures

Fig. 1.
Model architecture. The top layer denotes a compressive representation t′ of objective time t. t′ maps onto subjective time τ, scaled by the factor η. Each time cell d is preferentially tuned to a time μd and responds with activation xd. The sum of the features xd, weighted by wd, gives the estimated value V̂.
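The architecture in Fig. 1 can be sketched as follows (the Gaussian tuning curves, ramping weights, and parameter values are assumptions for illustration; the paper's exact feature shapes may differ):

```python
import numpy as np

def time_cells(t, mu, sigma, eta=1.0):
    """Time-cell activations x_d: each cell d is preferentially tuned
    to a time mu_d in subjective time tau = eta * t."""
    tau = eta * t
    return np.exp(-0.5 * ((tau - mu) / sigma) ** 2)

def value_estimate(t, mu, sigma, w, eta=1.0):
    """Estimated value: the sum of the features x_d weighted by w_d."""
    return w @ time_cells(t, mu, sigma, eta)

mu = np.arange(11.0)           # preferred times mu_d of 11 time cells
sigma = np.ones(11)            # tuning widths
w = np.linspace(0.0, 1.0, 11)  # ramping weights give a ramping value
```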
Fig. 2.
Effect of scaling η on activation of time cells. When measured against objective time t, higher η leads to more compressed time cell activations. Here, the same η for all features is learned. By Eq. 5, it is straightforward to show that all features move in tandem with a constant coefficient of variation, and no crossovers are produced during rescaling (appendix).
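The no-crossover claim in Fig. 2 follows from simple arithmetic (the mapping τ = η·t comes from the model description; everything else here is illustrative): a cell tuned to (μd, σd) in subjective time peaks at objective time μd/η with objective-time width σd/η, so its coefficient of variation σd/μd is unchanged by η and relative ordering is preserved.

```python
# Toy check that rescaling by eta leaves the coefficient of variation
# (width / peak time, measured in objective time) constant.
def objective_peak_and_width(mu_d, sigma_d, eta):
    return mu_d / eta, sigma_d / eta

cvs = []
for eta in (0.5, 1.0, 2.0):
    peak, width = objective_peak_and_width(mu_d=5.0, sigma_d=1.0, eta=eta)
    cvs.append(width / peak)  # = sigma_d / mu_d, independent of eta
```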
Fig. 3.
Effect of state uncertainty on value estimation and reward prediction error (RPE). With a perfectly learned value function (top, gray), δτ reduces to zero throughout the entire trial duration (bottom, gray). With state uncertainty, implemented by an overlapping feature set, value cannot be estimated perfectly (top, black), and δτ is nonzero even after extensive learning (bottom, black). (Figure for illustration only. For smaller T or larger γ, the initial phasic RPE can be larger than RPE at reward time, illustrated here at t = 20 and t = 75, respectively.)
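The effect of feature overlap in Fig. 3 can be sketched with TD(0) under linear function approximation (an illustrative toy, not the paper's simulation; the feature widths, learning parameters, and thresholds are assumptions): with one-hot features the RPE is driven essentially to zero everywhere, while overlapping Gaussian features leave a residual RPE even after extensive learning.

```python
import numpy as np

def td_linear(features, reward_time=15, gamma=0.95, alpha=0.05, n_trials=3000):
    """TD(0) with linear function approximation V(t) = w @ features[t];
    returns the RPEs remaining after learning."""
    T, d = features.shape
    w = np.zeros(d)
    for _ in range(n_trials):
        for t in range(T - 1):
            r = 1.0 if t == reward_time else 0.0
            delta = r + gamma * (w @ features[t + 1]) - w @ features[t]
            w += alpha * delta * features[t]
    rpe = [(1.0 if t == reward_time else 0.0)
           + gamma * (w @ features[t + 1]) - w @ features[t]
           for t in range(T - 1)]
    return np.array(rpe)

T = 20
ts = np.arange(T)
onehot = np.eye(T)  # perfect state representation
overlap = np.exp(-0.5 * ((ts[:, None] - ts[None, :]) / 2.0) ** 2)  # state uncertainty

rpe_perfect = td_linear(onehot)
rpe_uncertain = td_linear(overlap)
```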
Fig. 4.
Bidirectional learning rule for η. Because of feature overlap, the learned value (black curve, plotted here against objective time) does not drop immediately to zero after reward time T (dashed vertical line). This gradual decrease allows its time derivative dV̂/dt to be negative over a nonzero domain of time, which in turn allows the update rule for η in Eq. 6 to take negative values over that same domain. It follows that η increases if reward is delivered roughly before T and decreases if reward is delayed past T (gray curve).
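The sign logic of Fig. 4 can be illustrated with a toy. Eq. 6 is not reproduced in this excerpt, so the update below (proportional to the slope of the learned value at the time reward is delivered) is an assumed stand-in that only mirrors the caption: the learned value rises before the expected reward time T and, because of feature overlap, decays gradually after it.

```python
import numpy as np

def learned_value(t, T=10.0, width=2.0):
    """Smoothed learned value peaking near reward time T; the gradual
    decay after T mimics feature overlap."""
    return np.exp(-0.5 * ((t - T) / width) ** 2)

def eta_update(t_reward, T=10.0, width=2.0, lr=0.1, dt=1e-4):
    """Toy update whose direction follows dV/dt at reward delivery."""
    v_dot = (learned_value(t_reward + dt, T, width)
             - learned_value(t_reward - dt, T, width)) / (2 * dt)
    return lr * v_dot

early = eta_update(t_reward=8.0)   # reward before T: eta speeds up
late = eta_update(t_reward=12.0)   # reward delayed past T: eta slows down
```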
Fig. 5.
Results from behavioral experiments compared with model behavior. A: Killeen and Fetterman (1988) have found that the speed of the pacemaker is directly proportional to the rate of reinforcement. Reprinted from Killeen and Fetterman (1988). B: Morgan et al. (1993) have found that when exposed to high or low rates of reinforcement and returned to a baseline condition, pigeons’ behaviors were consistent with a faster or slower pacemaker, respectively. Reprinted from Morgan et al. (1993) with permission from Elsevier. C and D: model behavior, recapitulating observations in A and B, respectively. See methods for simulation details.
Fig. 6.
Results from electrophysiology compared with model behavior. A: in Mello et al. (2015), electrophysiological recordings in rat striatum during a timekeeping task identified medium spiny neurons that fired sequentially during the delay period and whose response profiles rescaled to reflect the timed duration [fixed interval (FI); note scaling of x-axis] but maintained their relative ordering. In addition, by visual inspection, the gradual increase in response profile widths across cells within each trial seems similar across different task durations, as would be suggested by the scalar property. Reprinted from Mello et al. (2015) with permission from Elsevier. B: our model recapitulates both phenomena. See methods for simulation details.
Fig. 7.
Results from pharmacology compared with model behavior. A: in Lake and Meck (2013), subjects reproduced previously learned 7-s and 17-s intervals after administration of the dopamine (DA) agonist amphetamine (AMP), the DA antagonist haloperidol (HAL), or placebo. In the majority of subjects, amphetamine reduced response time, whereas haloperidol delayed it. Reprinted from Lake and Meck (2013) with permission from Elsevier. B: our model recapitulates these effects. See methods for simulation details.
Fig. 8.
Results from optogenetic stimulation of midbrain dopamine neurons compared with model behavior. A and B: in Soares et al. (2016), mice were trained on a temporal discrimination task in which they had to judge intervals as either shorter or longer than 1.5 s, and psychometric functions were fit to the data (black curves in both panels). A: under optogenetic activation spanning the entire trial, the psychometric function shifted to the right (dark gray curve), consistent with a slower pacemaker. Insets show the average difference between the probability of selecting the long choice during activation trials vs. control trials per animal (top left) or per stimulus (bottom right). B: under optogenetic inhibition, the psychometric function shifted to the left (light gray curve), consistent with a faster pacemaker. Insets: same as in A, but for inhibition. A and B from Soares et al. Science 354: 1273–1277, 2016. Reprinted with permission from AAAS. C and D: our model recapitulates these effects. See methods for simulation details.
Fig. 9.
Results from optogenetic stimulation of basal ganglia output compared with model behavior. A, top: in Toda et al. (2017), mice were trained on a peak-interval licking task, and peak licking robustly reflected reward time. The nigrotectal pathway was optogenetically stimulated immediately after reward (left), immediately before reward (center), or 1 s before reward (right). Bottom: stimulation resulted in a shift in peak licking on the subsequent trial. The peak time occurred later when stimulation was delivered immediately after or immediately before reward (left and center), and it occurred earlier when stimulation was completed 1 s before reward (right). *P < 0.05. Reprinted from Toda et al. (2017) with permission from Elsevier. B: our model recapitulates these effects. See methods for simulation details.
Fig. 10.
Simulated effect of tonic dopamine (DA) on value and reward prediction error (RPE) against objective time. A: in our model, higher-DA conditions lead to a faster pacemaker and more compressed features, which serve as states. By Eq. 4, this leads to a steeper slope in the value function when measured against objective time. B: DA response, computed by Eq. 4, and based on the corresponding value functions in A. Left and right dashed lines denote conditioned stimulus and reward time, respectively.
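The steeper-slope claim in Fig. 10A reduces to simple arithmetic (the linear ramp is an assumed illustration, not the simulated value function): if value ramps in subjective time, V = k·τ, and τ = η·t, then measured against objective time V = k·η·t, so a higher η (a faster pacemaker under higher tonic DA) produces a steeper objective-time slope.

```python
# Per-unit-objective-time slope of a value ramp under two eta values.
def value_vs_objective_time(t, k=0.1, eta=1.0):
    return k * eta * t

slope_low = value_vs_objective_time(1.0, eta=1.0)   # k * 1.0
slope_high = value_vs_objective_time(1.0, eta=1.5)  # k * 1.5
```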
