Quantifying exploration in reward-based motor learning

PLoS One. 2020 Apr 2;15(4):e0226789. doi: 10.1371/journal.pone.0226789. eCollection 2020.


Exploration in reward-based motor learning is observable in experimental data as increased variability. To quantify exploration, we compare three methods for estimating the other source of variability, sensorimotor noise. We use a task in which participants could receive stochastic binary reward feedback following a target-directed weight shift. Participants first performed six baseline blocks without feedback, followed by twenty blocks that alternated between feedback and no feedback. Variability was assessed based on trial-to-trial changes in movement endpoint. We estimated sensorimotor noise as the median squared trial-to-trial change in movement endpoint for trials in which no exploration was expected. We identified three types of such trials: trials in baseline blocks, trials in the blocks without feedback, and rewarded trials in the blocks with feedback. We estimated exploration as the median squared trial-to-trial change following non-rewarded trials minus sensorimotor noise. As expected, variability was larger following non-rewarded trials than following rewarded trials. This indicates that our reward-based weight-shifting task successfully induced exploration. Most importantly, our three estimates of sensorimotor noise differed: the estimate based on rewarded trials was significantly lower than the estimates based on the two types of trials without feedback. Consequently, the estimates of exploration also differed. We conclude that the quantification of exploration depends critically on the type of trials used to estimate sensorimotor noise. We recommend estimating sensorimotor noise from the variability following rewarded trials.
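The estimate described in the abstract can be illustrated with a minimal sketch. It assumes that the change "following" trial t is the difference between the endpoints of trials t+1 and t, and that sensorimotor noise is taken from rewarded trials, as the abstract recommends. The function names and the example data are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_squared_change(endpoints, select):
    """Median squared change from trial t to trial t+1, over trials t where select[t] is True."""
    changes = [
        (endpoints[t + 1] - endpoints[t]) ** 2
        for t in range(len(endpoints) - 1)  # the last trial has no following trial
        if select[t]
    ]
    return median(changes)


def estimate_exploration(endpoints, rewarded):
    """Return (exploration, noise) for one feedback block.

    Assumption: noise is the variability following rewarded trials;
    exploration is the variability following non-rewarded trials minus that noise.
    """
    noise = median_squared_change(endpoints, rewarded)
    after_non_rewarded = median_squared_change(endpoints, [not r for r in rewarded])
    return after_non_rewarded - noise, noise


# Hypothetical movement endpoints (arbitrary units) and binary reward outcomes.
endpoints = [0.0, 1.0, 1.5, 3.5, 4.0, 1.0, 1.5]
rewarded = [False, True, False, True, False, True, True]

exploration, noise = estimate_exploration(endpoints, rewarded)
print(exploration, noise)  # 3.75 0.25
```

To use one of the other two noise estimates compared in the abstract, the same median squared change would be computed over all trials of a baseline block or a no-feedback block instead of over rewarded trials.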

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Behavior / physiology
  • Biofeedback, Psychology
  • Female
  • Humans
  • Learning / physiology*
  • Male
  • Middle Aged
  • Motor Activity / physiology*
  • Musculoskeletal Physiological Phenomena
  • Psychomotor Performance / physiology*
  • Reaction Time / physiology
  • Research Design
  • Reward
  • Statistical Distributions
  • Young Adult

Grants and funding

The research was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Toegepaste en Technische Wetenschappen (NWO-TTW), through Open Technologie Programma (OTP) grant 15989 awarded to Jeroen Smeets.