Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets

Neural Netw. 2003 Mar;16(2):241-50. doi: 10.1016/S0893-6080(02)00219-8.

Abstract

The long short-term memory (LSTM) network trained by gradient descent solves difficult problems which traditional recurrent neural networks in general cannot solve. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, significantly reducing the number of training steps compared to the original gradient descent training algorithm. In this paper we present a set of experiments on problems which are unsolvable by classical recurrent networks but which are solved elegantly, robustly, and quickly by LSTM combined with Kalman filters.
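The abstract refers to the decoupled extended Kalman filter (DEKF), which treats the network weights as state variables, partitions them into groups (e.g. one group per neuron), and maintains a separate error covariance matrix per group. The sketch below illustrates that general update scheme only; it is not the paper's LSTM-specific formulation. The function name `dekf_step`, the toy model, and the finite-difference Jacobians are illustrative assumptions (the paper derives the output Jacobians from LSTM's own truncated gradient instead).

```python
# Minimal DEKF weight-update sketch (assumed, generic formulation; not the
# authors' exact method). Scalar network output, finite-difference Jacobians.
import numpy as np

def dekf_step(groups, P, model, x, y, R=1.0, Q=1e-6, eps=1e-5):
    """One decoupled extended Kalman filter update.
    groups: list of 1-D weight vectors, one per group (e.g. per neuron)
    P:      list of covariance matrices, one per group
    model:  callable(groups, x) -> scalar prediction
    """
    y_hat = model(groups, x)
    err = y - y_hat                               # innovation

    # Per-group Jacobians d y_hat / d w_i, here by central differences.
    H = []
    for i, w in enumerate(groups):
        h = np.zeros_like(w)
        for j in range(w.size):
            w_plus, w_minus = w.copy(), w.copy()
            w_plus[j] += eps
            w_minus[j] -= eps
            h[j] = (model(groups[:i] + [w_plus] + groups[i+1:], x)
                    - model(groups[:i] + [w_minus] + groups[i+1:], x)) / (2 * eps)
        H.append(h)

    # Global scaling term A = (R + sum_i H_i P_i H_i^T)^(-1), shared by all groups.
    a = 1.0 / (R + sum(h @ P_i @ h for h, P_i in zip(H, P)))

    # Decoupled Kalman gain, weight update, and covariance update per group.
    for i, (w, h) in enumerate(zip(groups, H)):
        k = a * (P[i] @ h)                        # Kalman gain for group i
        groups[i] = w + k * err
        P[i] = P[i] - np.outer(k, h) @ P[i] + Q * np.eye(w.size)
    return groups, P, err

# Toy usage: fit a two-group model y = tanh(w1 . x) * w2 to synthetic data.
rng = np.random.default_rng(0)
model = lambda g, x: float(np.tanh(g[0] @ x) * g[1][0])
target = lambda x: float(np.tanh(np.array([1.0, -2.0, 0.5]) @ x) * 1.5)
groups = [0.1 * rng.normal(size=3), 0.1 * rng.normal(size=1)]
P = [100.0 * np.eye(3), 100.0 * np.eye(1)]
for _ in range(500):
    x = rng.normal(size=3)
    groups, P, err = dekf_step(groups, P, model, x, target(x))
```

The decoupling into per-group covariance matrices is what makes the filter tractable for networks: a fully coupled EKF would need one covariance matrix over all weights, which scales quadratically in the total weight count.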

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Memory, Short-Term / physiology*
  • Neural Networks, Computer*