Evaluating outcomes of behavior is a central function of the striatum. In circuits engaging the dorsomedial striatum, sensitivity to goal value is accentuated during learning, whereas outcome sensitivity is thought to be minimal in the dorsolateral striatum and its habit-related corticostriatal circuits. However, a distinct population of projection neurons in the dorsolateral striatum exhibits selective sensitivity to rewards. Here, we evaluated the outcome-related signaling in such neurons as rats performed an instructional T-maze task for two rewards. As the rats formed maze-running habits and then changed behavior after reward devaluation, we detected outcome-related spike activity in 116 units out of 1,479 recorded units. During initial training, nearly equal numbers of these units fired preferentially either after rewarded runs or after unrewarded runs, and the majority were responsive at only one of two reward locations. With overtraining, as habits formed, firing in nonrewarded trials almost disappeared, and reward-specific firing declined. Thus error-related signaling was lost, and reward signaling became generalized. Following reward devaluation, in an extinction test, postgoal activity was nearly undetectable, despite accurate running. Strikingly, when rewards were then returned, postgoal activity reappeared and recapitulated the original early response pattern, with nearly equal numbers responding to rewarded and unrewarded runs and to single rewards. These findings demonstrate that outcome evaluation in the dorsolateral striatum is highly plastic and tracks stages of behavioral exploration and exploitation. These signals could be a new target for understanding compulsive behaviors that involve changes to dorsal striatum function.
Keywords: basal ganglia; electrophysiology; learning; reinforcement.
Copyright © 2016 the American Physiological Society.