The medial prefrontal cortex (mPFC) plays a major role in goal-directed behaviours, but it is unclear whether it plays a role in breaking away from a high-value reward in order to explore for better options. To address this question, we designed a novel 3-arm Bandit Task in which rats were required to choose one of three potential reward arms, each of which was associated with a different amount of food reward and time-out punishment. After a variable number of choice trials the reward locations were shuffled and animals had to disengage from the now devalued arm and explore the other options in order to optimise payout. Lesion and control groups' behaviours on the task were then analysed by fitting data with a reinforcement learning model. As expected, lesioned animals obtained less reward overall due to an inability to flexibly adapt their behaviours after a change in reward location. However, modelling results showed that lesioned animals were no more likely to explore than control animals. We also discovered that all animals showed a strong preference for certain maze arms, at the expense of reward. This tendency was exacerbated in the lesioned animals, with the strongest effects seen in a subset of animals with damage to dorsal mPFC. The results confirm a role for mPFC in goal-directed behaviours but suggest that rats rely on other areas to resolve the explore-exploit dilemma.
Keywords: Decision making; Exploration; Prefrontal cortex; Reinforcement learning; Reward; Value.
Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.