Contextual influence on confidence judgments in human reinforcement learning

PLoS Comput Biol. 2019 Apr 8;15(4):e1006973. doi: 10.1371/journal.pcbi.1006973. eCollection 2019 Apr.

Abstract

The ability to correctly estimate the probability of one's choices being correct is fundamental to optimally re-evaluate previous choices or to arbitrate between different decision strategies. Experimental evidence nonetheless suggests that this metacognitive process (confidence judgment) is susceptible to numerous biases. Here, we investigate the effect of outcome valence (gains or losses) on confidence while participants learned stimulus-outcome associations by trial-and-error. In two experiments, participants were more confident in their choices when learning to seek gains compared to avoiding losses, despite equal difficulty and performance between those two contexts. Computational modelling revealed that this bias is driven by the context-value, a dynamically updated estimate of the average expected value of choice options, necessary to explain equal performance in the gain and loss domains. The biasing effect of context-value on confidence, revealed here for the first time in a reinforcement-learning context, is therefore domain-general, with likely important functional consequences. We show that one such consequence emerges in volatile environments, where the (in)flexibility of individuals' learning strategies differs when outcomes are framed as gains or losses. Despite apparently similar behavior, profound asymmetries might therefore exist between learning to avoid losses and learning to seek gains.
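
The idea of a context-value as "a dynamically updated estimate of the average expected value of choice options" can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything below is an assumption: a plain delta-rule update for the context value, an option value learned from the outcome expressed relative to that context value, and hypothetical names (`update_context_value`, `update_option_value`, `run_context`) and learning rates chosen only for illustration. It is not the authors' fitted model.

```python
# Illustrative sketch only: the update rules, names, and learning rates
# are assumptions, not the model reported in the paper.

def update_context_value(v, outcome, alpha_v):
    """Delta-rule update: move the context value toward the latest outcome."""
    return v + alpha_v * (outcome - v)


def update_option_value(q, outcome, v, alpha_q):
    """Update an option value from the outcome taken relative to the context value."""
    relative_outcome = outcome - v
    return q + alpha_q * (relative_outcome - q)


def run_context(outcomes, alpha_v=0.3, alpha_q=0.3):
    """Track context value and one option's value over a sequence of outcomes."""
    v, q = 0.0, 0.0
    for outcome in outcomes:
        v = update_context_value(v, outcome, alpha_v)
        q = update_option_value(q, outcome, v, alpha_q)
    return v, q


# Same outcome pattern, framed as gains (1 or 0) or as losses (0 or -1).
v_gain, q_gain = run_context([1.0, 0.0, 1.0, 1.0])
v_loss, q_loss = run_context([0.0, -1.0, 0.0, 0.0])
print(f"gain context: V={v_gain:.3f}, Q={q_gain:.3f}")
print(f"loss context: V={v_loss:.3f}, Q={q_loss:.3f}")
```

In this sketch the context value ends up positive in the gain frame and negative in the loss frame, while the option values are learned on a recentred scale. That is one way a learner could perform similarly in both frames while still carrying a valence-dependent signal that might bias confidence.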

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Choice Behavior / ethics*
  • Choice Behavior / physiology
  • Decision Making / ethics*
  • Decision Making / physiology
  • Female
  • Humans
  • Judgment / ethics*
  • Judgment / physiology
  • Learning
  • Male
  • Reinforcement, Psychology
  • Self Concept
  • Young Adult

Associated data

  • figshare/10.6084/m9.figshare.7851767

Grants and funding

This work was supported by startup funds from the Amsterdam School of Economics, awarded to JBE. JBE and ML gratefully acknowledge support from Amsterdam Brain and Cognition (ABC). ML is supported by an NWO Veni Fellowship (Grant 451-15-015), a Swiss National Fund Ambizione grant (PZ00P3_174127) and the Fondation Bettencourt Schueller. SP is supported by an ATIP-Avenir grant (R16069JS), the Programme Emergence(s) de la Ville de Paris, the Fondation Fyssen and Fondation Schlumberger pour l’Education et la Recherche. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.