Training memristor-based multilayer neuromorphic networks with SGD, momentum and adaptive learning rates

Zheng Yan; Jiadong Chen; Rui Hu; Tingwen Huang; Yiran Chen; Shiping Wen

doi:10.1016/j.neunet.2020.04.025

Neural Netw. 2020 Aug:128:142-149. doi: 10.1016/j.neunet.2020.04.025. Epub 2020 May 7.

Authors

Zheng Yan¹, Jiadong Chen², Rui Hu¹, Tingwen Huang³, Yiran Chen⁴, Shiping Wen⁵

Affiliations

¹ School of Automation and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China.
² School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
³ Science Program, Texas A & M University at Qatar, Doha 23874, Qatar.
⁴ Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA.
⁵ Centre for Artificial Intelligence, University of Technology Sydney, Ultimo, NSW 2007, Australia. Electronic address: shiping.wen@uts.edu.au.

PMID: 32446191

Abstract

Neural networks implemented on traditional hardware face an inherent limitation: memory latency. Specifically, processing units such as GPUs, FPGAs, and customized ASICs must wait for inputs to be read from memory and for outputs to be written back. This motivates memristor-based neuromorphic computing, in which the memory units (i.e., memristors) have computing capabilities. However, training a memristor-based neural network is difficult because memristors work differently from CMOS hardware. This paper proposes a new training approach that enables prevailing neural network training techniques to be applied to memristor-based neuromorphic networks. In particular, we introduce momentum and an adaptive learning rate into the circuit training, both of which are proven methods that significantly accelerate the convergence of neural network parameters. Furthermore, we show that this circuit can be used for neural networks with arbitrary numbers of layers, neurons, and parameters. Simulation results on four classification tasks demonstrate that the proposed circuit achieves both high accuracy and fast training. Compared with the SGD-based training circuit, on the WBC data set, our circuit trains 37.2% faster while its accuracy is reduced by only 0.77%. On the MNIST data set, the new circuit even leads to improved accuracy.
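The abstract names momentum and an adaptive learning rate but does not give the update rules used in the circuit. As a rough software illustration only, the sketch below combines classical momentum with a simple "bold driver" learning-rate heuristic (grow the rate after a step that lowers the loss, shrink it otherwise). The heuristic, the function name `train_momentum_adaptive`, the hyperparameters, and the toy quadratic loss are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def train_momentum_adaptive(grad_fn, loss_fn, w0, lr=0.1, mu=0.9,
                            grow=1.05, shrink=0.5, steps=200):
    """Gradient descent with classical momentum and a bold-driver learning rate.

    Hypothetical helper: names and default values are illustrative assumptions,
    not the update rule of the memristor circuit described in the abstract.
    """
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)              # momentum (velocity) term
    best_loss = loss_fn(w)
    for _ in range(steps):
        v_new = mu * v - lr * grad_fn(w)   # classical momentum update
        w_new = w + v_new
        loss = loss_fn(w_new)
        if loss <= best_loss:
            # Step helped: accept it and cautiously increase the learning rate.
            w, v, best_loss = w_new, v_new, loss
            lr *= grow
        else:
            # Step hurt: reject it, reset the velocity, and cut the learning rate.
            v = np.zeros_like(w)
            lr *= shrink
    return w, best_loss, lr


# Toy problem (assumed for illustration): loss(w) = 0.5 * w^T A w, minimised at w = 0.
A = np.diag([1.0, 10.0])


def loss_fn(w):
    return 0.5 * float(w @ A @ w)


def grad_fn(w):
    return A @ w


w_start = np.array([3.0, -2.0])
initial_loss = loss_fn(w_start)
w_final, final_loss, final_lr = train_momentum_adaptive(grad_fn, loss_fn, w_start)
```

Because a step is accepted only when it does not increase the loss, the recorded loss never rises in this sketch. A hardware implementation would realise the weight updates through memristor conductance changes, which this software example does not model.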

Keywords: Adaptive learning rate; Memristor; Neural network.

MeSH terms

Humans
Learning
Motion*
Neural Networks, Computer*
Neurons / physiology