Humans adaptively perform actions to achieve their goals. This flexible behaviour requires two core abilities: the ability to anticipate the outcomes of candidate actions and the ability to select and implement actions in a goal-directed manner. The ability to predict outcomes has been extensively researched in reinforcement learning paradigms, but this work has often focused on simple actions that are not embedded in the hierarchical and sequential structures characteristic of goal-directed human behaviour. On the other hand, the ability to select actions in accordance with high-level task goals, particularly in the presence of alternative responses and salient distractors, has been widely researched in cognitive control paradigms. Cognitive control research, however, has often paid less attention to the role of action outcomes. The present review attempts to bridge these accounts by proposing an outcome-guided mechanism for the selection of extended actions. Our proposal builds on constructs from the hierarchical reinforcement learning literature, which emphasises the concept of reaching and evaluating informative states, i.e., states that constitute subgoals in complex actions. We develop an account of the neural mechanisms that allow outcome-guided action selection to be achieved in a network that relies on projections from cortical areas to the basal ganglia and back-projections from the basal ganglia to the cortex. These cortico-basal ganglia-thalamo-cortical 'loops' allow convergence, and thus integration, of information from non-adjacent cortical areas (for example, between sensory and motor representations). This integration is essential in action sequences, for which achieving an anticipated sensory state signals the successful completion of an action. We further describe how projection pathways within the basal ganglia allow selection between representations, which may pertain to movements, actions, or extended action plans.
Lastly, the model envisages a role for hierarchical projections from the striatum to dopaminergic midbrain areas that enable more rostral frontal areas to bias the selection of inputs from more posterior frontal areas via their respective representations in the basal ganglia.
Keywords: Action selection; Basal ganglia; Cognitive control; Dopamine; Hierarchical reinforcement learning; Ideomotor principle; Prediction; Prefrontal cortex; Reinforcement learning; Striatum.
Copyright © 2015. Published by Elsevier Ltd.