In order to investigate roles of dopamine receptor subtypes in reward-based learning, we examined choice behavior of dopamine D1 and D2 receptor-knockout (D1R-KO and D2R-KO, respectively) mice in an instrumental learning task with progressively increasing reversal frequency and a dynamic two-armed bandit task. Performance of D2R-KO mice was progressively impaired in the former as the frequency of reversal increased and profoundly impaired in the latter even with prolonged training, whereas D1R-KO mice showed relatively minor performance deficits. Choice behavior in the dynamic two-armed bandit task was well explained by a hybrid model including win-stay-lose-switch and reinforcement learning terms. A model-based analysis revealed increased win-stay, but impaired value updating and decreased value-dependent action selection in D2R-KO mice, which were detrimental to maximizing rewards in the dynamic two-armed bandit task. These results suggest an important role of dopamine D2 receptors in learning from past choice outcomes for rapid adjustment of choice behavior in a dynamic and uncertain environment.
Keywords: D1 receptor; D2 receptor; dynamic foraging task; mouse; reinforcement learning; reversal.