Individuals differ in how they learn from experience. In Pavlovian conditioning models, where cues predict reinforcer delivery at a separate goal location, some animals, called sign-trackers, come to approach the cue, whereas others, called goal-trackers, approach the goal. In sign-trackers, model-free phasic dopaminergic reward prediction errors underlie learning, which renders stimuli 'wanted'. Goal-trackers do not rely on dopamine for learning and are thought to use model-based learning. We demonstrate this double dissociation in 129 male humans using eye-tracking, pupillometry and functional magnetic resonance imaging informed by computational models of sign- and goal-tracking. We show that sign-trackers exhibit a neural reward prediction error signal that is not detectable in goal-trackers. Model-free value guides gaze and pupil dilation only in sign-trackers. Goal-trackers instead exhibit a stronger model-based neural state prediction error signal. This model-based construct determines gaze and pupil dilation more strongly in goal-trackers than in sign-trackers.
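
The abstract does not give the equations behind the two learning signals. As a minimal sketch, the contrast between a model-free reward prediction error and a model-based state prediction error can be illustrated with textbook delta-rule updates. The function names, the learning rates `alpha` and `eta`, and the update rules themselves are assumptions chosen for illustration, not the authors' fitted models.

```python
def model_free_update(value, reward, alpha=0.1):
    """Delta-rule update of a cue's cached value (illustrative assumption).

    Returns the new value and the reward prediction error (RPE).
    """
    rpe = reward - value              # outcome minus expectation
    return value + alpha * rpe, rpe


def model_based_update(transition_row, next_state, eta=0.1):
    """Update the transition probabilities out of one state (illustrative assumption).

    transition_row: list of probabilities P(next state | current state).
    Returns the new row and the state prediction error (SPE).
    """
    spe = 1.0 - transition_row[next_state]   # surprise about the observed state
    # Shrink every probability, then add mass to the observed transition,
    # so the row still sums to 1.
    new_row = [p * (1.0 - eta) for p in transition_row]
    new_row[next_state] += eta
    return new_row, spe


if __name__ == "__main__":
    v, rpe = model_free_update(value=0.0, reward=1.0, alpha=0.5)
    row, spe = model_based_update([0.5, 0.5], next_state=1, eta=0.2)
    print(v, rpe, row, spe)
```

In this sketch the model-free learner stores only a scalar value per cue, whereas the model-based learner stores where each state leads. That difference is why the two signals can, in principle, be dissociated.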