Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum
- PMID: 26529522
- PMCID: PMC4631489
- DOI: 10.1371/journal.pcbi.1004540
Abstract
Previous theoretical studies of animal and human behavioral learning have focused on the dichotomy between the value-based strategy, which uses action value functions to predict rewards, and the model-based strategy, which uses internal models to predict environmental states. However, animals and humans often adopt simple procedural behaviors, such as the "win-stay, lose-switch" strategy, without explicit prediction of rewards or states. Here we consider another strategy, the finite state-based strategy, in which a subject selects an action depending on its discrete internal state and updates the state depending on the action chosen and the reward outcome. By analyzing the choice behavior of rats in a free-choice task, we found that the finite state-based strategy fitted their behavioral choices more accurately than the value-based and model-based strategies did. When fitted models were run autonomously on the same task, only the finite state-based strategy could reproduce the key feature of choice sequences. Analyses of neural activity recorded from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) identified significant fractions of neurons in all three subareas whose activities were correlated with individual states of the finite state-based strategy. Signals of internal states at the time of choice were found in DMS, and signals for clusters of states were found in VS. In addition, action values and state values of the value-based strategy were encoded in DMS and VS, respectively. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum.
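The finite state-based strategy described in the abstract can be illustrated with a minimal sketch in Python. This is not the model fitted in the paper: the two-state layout, the state names, the 0.9 choice probability, and the reward probabilities are all assumptions chosen to mirror a noisy "win-stay, lose-switch" rule. The agent holds a discrete internal state, picks an action from a state-dependent probability, and then moves to a new state based only on the action taken and whether it was rewarded. No reward or state prediction is computed.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical internal states: each one biases the agent toward one action.
# Probability of choosing "L" in each state (assumed values, not from the paper).
P_LEFT: Dict[str, float] = {"prefer_L": 0.9, "prefer_R": 0.1}


def choose(state: str, rng: random.Random) -> str:
    """Select an action ("L" or "R") depending only on the internal state."""
    return "L" if rng.random() < P_LEFT[state] else "R"


def next_state(state: str, action: str, rewarded: bool) -> str:
    """Update the internal state from the chosen action and the outcome.

    Win-stay: after a reward, prefer the action just taken.
    Lose-switch: after no reward, prefer the other action.
    In this two-state sketch the current state does not affect the
    transition; a richer state machine would use it.
    """
    if rewarded:
        return "prefer_" + action
    return "prefer_R" if action == "L" else "prefer_L"


def simulate(
    n_trials: int, reward_prob: Dict[str, float], seed: int = 0
) -> List[Tuple[str, str, bool]]:
    """Run the agent on a two-choice task; return (state, action, rewarded) per trial."""
    rng = random.Random(seed)
    state = "prefer_L"  # arbitrary initial state
    history: List[Tuple[str, str, bool]] = []
    for _ in range(n_trials):
        action = choose(state, rng)
        rewarded = rng.random() < reward_prob[action]
        history.append((state, action, rewarded))
        state = next_state(state, action, rewarded)
    return history


if __name__ == "__main__":
    # Assumed reward probabilities for the two choices.
    for trial in simulate(10, {"L": 0.7, "R": 0.3}):
        print(trial)
```

A value-based agent would instead store and update numeric action values; here the only memory carried between trials is the discrete state label.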
Conflict of interest statement
The authors have declared that no competing interests exist.
