The anterior cingulate cortex (ACC) is known to play a crucial role in the rapid adaptation of behavior based on immediate reward values. What is less certain is whether the ACC is also involved in long-term adaptation to situations with uncertain outcomes. To study this issue, we placed macaque monkeys in a probabilistic context in which the appropriate strategy to maximize reward was to identify the stimulus with the highest reward value (the optimal stimulus). Only knowledge of the theoretical average reward value associated with this stimulus, referred to as 'the task value', was available. Remarkably, in each trial, ACC pre-reward activity correlated with the task value. Importantly, this neuronal activity was observed before the optimal stimulus was discovered. We hypothesize that the task value, constructed a priori through learning, is used together with the received rewards to guide behavior and identify the optimal stimulus. We tested this hypothesis by inactivating the ACC with muscimol. As predicted, this inactivation impaired the search for the optimal stimulus. We propose that the ACC participates in long-term adaptation of voluntary reward-based behaviors by encoding general task values and received rewards.