Reinforcement learning via kernel temporal difference

Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:5662-5. doi: 10.1109/IEMBS.2011.6091370.

Abstract

This paper introduces kernel Temporal Difference (TD)(λ), a kernel adaptive filter trained by stochastic gradient descent on temporal differences, to estimate the state-action value function in reinforcement learning. The case λ=0 is studied here. Experimental results show the method's applicability to learning motor state decoding during a center-out reaching task performed by a monkey. The results are compared with a time delay neural network (TDNN) trained by backpropagation of the temporal difference error. The experiments show that kernel TD(0) converges faster and reaches a better solution than the neural network.
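
The following is not part of the abstract, only a minimal sketch of the general kernel TD(0) idea in the kernel-adaptive-filter (KLMS-style) sense: each visited state becomes a kernel center whose coefficient is the step size times the TD error. It is shown for a state value function for simplicity (the paper estimates a state-action value function), and the Gaussian kernel, step size, and toy trajectory below are assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two state vectors."""
    diff = np.asarray(x) - np.asarray(y)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

class KernelTD0:
    """Sketch of a kernel TD(0) value-function estimator: the value
    function is a growing kernel expansion, and each TD update adds
    one kernel unit weighted by eta * (TD error)."""

    def __init__(self, eta=0.1, gamma=0.9, sigma=1.0):
        self.eta = eta        # learning rate
        self.gamma = gamma    # discount factor
        self.sigma = sigma    # kernel width (assumed)
        self.centers = []     # stored states acting as kernel centers
        self.coeffs = []      # per-center coefficients

    def value(self, state):
        """V(s) = sum_i a_i * k(s, c_i); 0 when the expansion is empty."""
        return sum(a * gaussian_kernel(state, c, self.sigma)
                   for a, c in zip(self.coeffs, self.centers))

    def update(self, state, reward, next_state):
        """One TD(0) step: compute the TD error and grow the expansion."""
        td_error = reward + self.gamma * self.value(next_state) - self.value(state)
        self.centers.append(np.asarray(state, dtype=float))
        self.coeffs.append(self.eta * td_error)
        return td_error

# Hypothetical usage on a toy 2-D trajectory with a made-up reward.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    agent = KernelTD0(eta=0.2, gamma=0.9, sigma=0.5)
    state = rng.normal(size=2)
    for _ in range(100):
        next_state = state + 0.1 * rng.normal(size=2)
        reward = -np.linalg.norm(next_state)   # toy reward: stay near origin
        agent.update(state, reward, next_state)
        state = next_state
    print("V(origin) ~", agent.value(np.zeros(2)))
```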

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Biomimetics / methods
  • Brain / physiology*
  • Electroencephalography / methods*
  • Humans
  • Pattern Recognition, Automated / methods*
  • Reinforcement, Psychology*
  • User-Computer Interface*