PLoS One. 2012;7(5):e37372. doi: 10.1371/journal.pone.0037372. Epub 2012 May 24.

Transferring learning from external to internal weights in echo-state networks with sparse connectivity


David Sussillo et al. PLoS One. 2012.

Abstract

Modifying weights within a recurrent network to improve performance on a task has proven to be difficult. Echo-state networks in which modification is restricted to the weights of connections onto network outputs provide an easier alternative, but at the expense of modifying the typically sparse architecture of the network by including feedback from the output back into the network. We derive methods for using the values of the output weights from a trained echo-state network to set recurrent weights within the network. The result of this "transfer of learning" is a recurrent network that performs the task without requiring the output feedback present in the original network. We also discuss a hybrid version in which online learning is applied to both output and recurrent weights. Both approaches provide efficient ways of training recurrent networks to perform complex tasks. Through an analysis of the conditions required to make transfer of learning work, we define the concept of a "self-sensing" network state, and we compare and contrast this with compressed sensing.
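
The core manipulation described here, replacing an explicit output-feedback loop with an equivalent change to the recurrent weights, can be illustrated with a toy simulation. The sketch below is not the paper's code: the rate model, the symbols J, u, and w, and the dense rank-one update J + u wᵀ are illustrative assumptions, and the update deliberately ignores the sparsity constraint that the paper's transfer method is designed to respect.

    # Minimal sketch (assumptions noted above): a rate network driven either by an
    # explicit output-feedback loop or by recurrent weights with that loop absorbed.
    import numpy as np

    rng = np.random.default_rng(0)
    N, g, dt, T = 200, 1.5, 0.01, 1000

    J = g * rng.standard_normal((N, N)) / np.sqrt(N)   # fixed recurrent weights
    u = rng.uniform(-1.0, 1.0, N)                      # fixed feedback weights (output -> units)
    w = rng.standard_normal(N) / np.sqrt(N)            # stand-in for trained output weights
    x0 = 0.5 * rng.standard_normal(N)                  # shared initial condition

    def run(J_eff, use_feedback):
        # Euler-integrate dx/dt = -x + J_eff r + u z (feedback term optional),
        # with rates r = tanh(x) and output z = w . r
        x = x0.copy()
        z_trace = np.empty(T)
        for t in range(T):
            r = np.tanh(x)
            z = w @ r
            drive = J_eff @ r + (u * z if use_feedback else 0.0)
            x = x + dt * (-x + drive)
            z_trace[t] = z
        return z_trace

    # Architecture A: original recurrent weights plus an explicit output-feedback loop.
    z_A = run(J, use_feedback=True)

    # Architecture B: the feedback absorbed into the recurrent matrix as J + u w^T,
    # so no output-feedback loop is needed.
    z_B = run(J + np.outer(u, w), use_feedback=False)

    print("max |z_A - z_B| =", np.abs(z_A - z_B).max())  # equal up to accumulated round-off

Under sparse connectivity the full rank-one term u wᵀ cannot simply be added to J, which is the situation the transfer-of-learning and "self-sensing" analysis in the paper addresses.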


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. The two recurrent network architectures being considered.
Non-modifiable connections are shown in black and modifiable connections in red. Both networks receive the same input, contain units that interact through a sparse recurrent weight matrix, and produce an output obtained by summing the activity of the entire network weighted by the modifiable components of the output weight vector. (A) The output unit sends feedback to all of the network units through connections of fixed weight. Learning affects only the output weights. (B) The same network as in A, but without output feedback. Learning takes place both within the network, through a modification of the recurrent weights that implements the effect of the feedback loop, and at the output weights, so that the correct output is learned.
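
In generic notation (not necessarily the notation used in the paper), the equivalence between these two architectures follows from absorbing the feedback term into each unit's recurrent input:

    \sum_j J_{ij} r_j + u_i z \;=\; \sum_j \left( J_{ij} + u_i w_j \right) r_j ,
    \qquad \text{where } z = \sum_j w_j r_j .

The required modification is therefore the generally dense rank-one term u_i w_j, which must be approximated when only the existing sparse entries of the recurrent matrix are allowed to change.
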
Figure 2. Comparison of network simulations and analytical results.
Network simulations (filled circles) and analytic results (solid lines) for the sparse (red points and curves) and PC (blue points and curves) reconstruction errors, plotted as a function of a network parameter. The three panels, from left to right, correspond to three values (0, 0.4, and 0.6) of a parameter of the input. Insets show the PC eigenvalues (blue) and the exponential fits to them (red). Logarithms are base 10.
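
The comparison plotted in this figure can be mimicked with a toy calculation. The sketch below is an illustrative analogue, not the paper's analysis: the activity model, the random-subset ("sparse") least-squares reconstruction, and the top-K principal-component ("PC") reconstruction are all assumptions, intended only to show the kind of error comparison being made.

    # Toy analogue of the sparse-vs-PC reconstruction comparison (see caveats above).
    import numpy as np

    rng = np.random.default_rng(1)
    N, T, D = 400, 2000, 40                            # units, time steps, latent dimensionality

    latents = rng.standard_normal((T, D))              # low-dimensional source of correlations
    R = np.tanh(latents @ (rng.standard_normal((D, N)) / np.sqrt(D)))   # T x N "activity"
    w = rng.standard_normal(N) / np.sqrt(N)            # hypothetical trained output weights
    z = R @ w                                          # target signal to reconstruct

    def rel_error(z_hat):
        return np.linalg.norm(z - z_hat) / np.linalg.norm(z)

    # Principal components of the (centered) activity, computed once.
    U, S, Vt = np.linalg.svd(R - R.mean(axis=0), full_matrices=False)
    z_c = z - z.mean()

    for K in (10, 20, 40, 80):
        # "Sparse" reconstruction: least squares using K randomly chosen units.
        idx = rng.choice(N, size=K, replace=False)
        coef, *_ = np.linalg.lstsq(R[:, idx], z, rcond=None)
        err_sparse = rel_error(R[:, idx] @ coef)

        # "PC" reconstruction: projection of the target onto the top-K components.
        z_pc = U[:, :K] @ (U[:, :K].T @ z_c) + z.mean()
        err_pc = rel_error(z_pc)

        print(f"K={K:3d}  sparse error={err_sparse:.3f}  PC error={err_pc:.3f}")
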
Figure 3. An example input-output task implemented in a network with feedback (A) and then transferred to a network without feedback using equation 21.
The upper row shows the input to the network, consisting of two pulses separated by less than 1 s (left columns of A and B) or by more than 1 s (right columns of A and B). The red traces show the output of the two networks, which correctly respond only to the input pulses separated by less than 1 s. The blue traces show 5 sample network units. The green traces show, for the five sample units, the feedback input in A and the corresponding recurrent input in B. The similarity of these traces shows that the transfer succeeded in making the recurrent input in B closely approximate the feedback input in A for each unit.
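
For concreteness, a two-pulse interval task of the kind described here can be generated as follows. Every parameter in this sketch (pulse width, amplitude, onset time, response duration) is a hypothetical choice; the paper specifies the actual task.

    # Hypothetical generator for a pulse-interval task: respond only when the
    # two input pulses are separated by less than 1 s. Parameters are illustrative.
    import numpy as np

    dt = 0.01                                            # seconds per time step

    def make_trial(gap_s, pulse_width_s=0.05, amp=1.0, trial_len_s=3.0):
        t = np.arange(0.0, trial_len_s, dt)
        inp = np.zeros_like(t)
        first_onset = 0.5                                # onset of the first pulse (s)
        for onset in (first_onset, first_onset + gap_s):
            inp[(t >= onset) & (t < onset + pulse_width_s)] = amp
        target = np.zeros_like(t)
        if gap_s < 1.0:                                  # respond only to short intervals
            resp_onset = first_onset + gap_s + pulse_width_s
            target[(t >= resp_onset) & (t < resp_onset + 0.5)] = 1.0
        return inp, target

    short_inp, short_target = make_trial(gap_s=0.6)      # should elicit a response
    long_inp, long_target = make_trial(gap_s=1.5)        # should not
    print(short_target.max(), long_target.max())         # prints 1.0 and 0.0
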
Figure 4. The distribution of the elements of the correlation-matrix eigenvectors, for equally spaced parameter values.
Eigenvectors of a correlation matrix obtained from simulations similar to those in figure 2, used to demonstrate that the distribution of their elements is approximately Gaussian. The red distribution in the front and the black distribution in the back correspond to the two extreme parameter values, with intermediate layers corresponding to intermediate values. The matrix was randomly initialized for each parameter value.
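
The Gaussian claim illustrated here can be checked numerically. The sketch below is not the paper's simulation: the activity model used to build the correlation matrix is an assumption, chosen only to give a structured matrix whose eigenvectors can be examined.

    # Illustrative check that the elements of correlation-matrix eigenvectors are
    # approximately Gaussian (the activity model is an assumption, see caveats above).
    import numpy as np

    rng = np.random.default_rng(2)
    N, T, D = 300, 5000, 30

    latents = rng.standard_normal((T, D))
    R = np.tanh(latents @ (rng.standard_normal((D, N)) / np.sqrt(D)))   # T x N activity
    C = np.corrcoef(R, rowvar=False)                                    # N x N correlation matrix

    eigvals, eigvecs = np.linalg.eigh(C)                 # eigenvalues in ascending order
    v = eigvecs[:, -5]                                   # eigenvector with the 5th-largest eigenvalue
    v = (v - v.mean()) / v.std()

    # Compare sample moments with the Gaussian values (skewness 0, excess kurtosis 0).
    skewness = np.mean(v**3)
    excess_kurtosis = np.mean(v**4) - 3.0
    print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")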
