Extended training can induce a shift in behavioral control from goal-directed actions, which are governed by action-outcome contingencies and sensitive to changes in the expected value of the outcome, to habits, which are less dependent on action-outcome relations and insensitive to changes in outcome value. Previous studies in rats have shown that interval schedules of reinforcement favor habit formation while ratio schedules favor goal-directed behavior. However, the molecular mechanisms underlying habit formation are not well understood. Endocannabinoids, which can function as retrograde messengers acting through presynaptic CB1 receptors, are highly expressed in the dorsolateral striatum, a key region involved in habit formation. Using a reversible devaluation paradigm, we confirmed that, in mice, random interval schedules also favor habit formation compared with random ratio schedules. We also found that training with interval schedules resulted in a preference for exploration of a novel lever, whereas training with ratio schedules resulted in less generalization and more exploitation of the reinforced lever. Furthermore, mice carrying either a heterozygous or a homozygous null mutation of the cannabinoid receptor type I (CB1) showed reduced habit formation and enhanced exploitation. The impaired habit formation in CB1 mutant mice cannot be attributed to chronic developmental or behavioral abnormalities, because pharmacological blockade of CB1 receptors specifically during training also impaired habit formation. Taken together, our data suggest that endocannabinoid signaling is critical for habit formation.
Keywords: action; decision-making; dopamine; exploration; habit; plasticity; reward.