Reinforcement learning and human behavior
- PMID: 24709606
- DOI: 10.1016/j.conb.2013.12.004
Abstract
The dominant computational approach to modeling operant learning and its underlying neural activity is model-free reinforcement learning (RL). However, there is accumulating behavioral and neural evidence that human (and animal) operant learning is far more multifaceted. Theoretical advances in RL, such as hierarchical and model-based RL, extend the explanatory power of RL to account for some of these findings. Nevertheless, some other aspects of human behavior remain inexplicable even in the simplest tasks. Here we review developments and remaining challenges in relating RL models to human operant learning. In particular, we emphasize that learning a model of the world is an essential step before, or in parallel with, learning the policy in RL, and we discuss alternative models that directly learn a policy without an explicit world model in terms of state-action pairs.
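The contrast drawn above between model-free value learning and directly learning a policy can be illustrated with a minimal sketch. The code below is not taken from the article: the two-armed bandit task, the reward probabilities, the learning rate `alpha`, the softmax `temperature`, and all function names are hypothetical choices made for illustration, and reading "directly learn a policy" as a REINFORCE-style gradient update on action preferences is an assumption.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Turn a list of numbers into choice probabilities."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an action index according to the given probabilities."""
    x = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if x < cumulative:
            return i
    return len(probs) - 1


def run_value_learner(reward_probs, n_trials=2000, alpha=0.1,
                      temperature=0.2, seed=0):
    """Model-free value learning: keep one value estimate per action."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)
    for _ in range(n_trials):
        a = sample(softmax(q, temperature), rng)
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        q[a] += alpha * (r - q[a])  # reward prediction error update
    return q


def run_policy_learner(reward_probs, n_trials=2000, alpha=0.1, seed=0):
    """Direct policy learning: adjust action preferences, no value table."""
    rng = random.Random(seed)
    n = len(reward_probs)
    h = [0.0] * n      # action preferences (policy parameters)
    baseline = 0.0     # running average reward, used to reduce variance
    for t in range(1, n_trials + 1):
        pi = softmax(h)
        a = sample(pi, rng)
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        delta = r - baseline
        for i in range(n):
            indicator = 1.0 if i == a else 0.0
            h[i] += alpha * delta * (indicator - pi[i])  # policy gradient step
        baseline += (r - baseline) / t
    return softmax(h)


if __name__ == "__main__":
    probs = [0.2, 0.8]  # hypothetical reward probability of each action
    print("value estimates:", run_value_learner(probs))
    print("policy probabilities:", run_policy_learner(probs))
```

Both learners here face a single state, so neither needs a world model; the sketch only shows the difference between storing action values and adjusting a policy directly, not the hierarchical or model-based extensions the abstract mentions.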
Copyright © 2013 Elsevier Ltd. All rights reserved.
