An effective SteinGLM initialization scheme for training multi-layer feedforward sigmoidal neural networks

Zebin Yang; Hengtao Zhang; Agus Sudjianto; Aijun Zhang

doi:10.1016/j.neunet.2021.02.014

Neural Netw. 2021 Jul:139:149-157. doi: 10.1016/j.neunet.2021.02.014. Epub 2021 Feb 27.

Authors

Zebin Yang¹, Hengtao Zhang¹, Agus Sudjianto², Aijun Zhang³

Affiliations

¹ Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong.
² Corporate Model Risk, Wells Fargo, USA.
³ Corporate Model Risk, Wells Fargo, USA. Electronic address: ajzhang@umich.edu.

PMID: 33706228
DOI: 10.1016/j.neunet.2021.02.014

Abstract

Network initialization is the first and a critical step in training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward sigmoidal neural networks as cascades of multi-index models, we initialize the projection weights to the first hidden layer using eigenvectors of the cross-moment matrix between the input's second-order score function and the response. The input data are then forward-propagated to the next layer, and the procedure is repeated until all hidden layers are initialized. Finally, the weights of the output layer are initialized by generalized linear modeling. Extensive numerical results show that the proposed SteinGLM method is much faster and more accurate than other popular methods commonly used for training neural networks.
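
The layer-wise procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the input to each layer is treated as approximately Gaussian, so that the second-order score function is S2(x) = Σ⁻¹(x − μ)(x − μ)ᵀΣ⁻¹ − Σ⁻¹; the response is centered, which makes the −Σ⁻¹ term drop out of the empirical cross-moment; each hidden width is at most the width of the layer feeding it; the biases simply center the pre-activations; and the output layer is fitted by least squares, i.e. a Gaussian GLM with identity link. The names `stein_directions` and `steinglm_init` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def stein_directions(X, y, k, ridge=1e-6):
    """Top-k eigenvectors of the empirical cross-moment E[y * S2(x)].

    Assumption (not from the abstract): x is roughly Gaussian, so
    S2(x) = P (x - mu)(x - mu)^T P - P with P the precision matrix.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False)) + ridge * np.eye(d)
    prec = np.linalg.inv(cov)
    Z = (X - mu) @ prec              # rows are P (x_i - mu)
    yc = y - y.mean()                # centered response: the "- P" term vanishes
    M = (Z * yc[:, None]).T @ Z / n  # (1/n) * sum_i yc_i * z_i z_i^T
    M = 0.5 * (M + M.T)              # enforce exact symmetry before eigh
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(vals))[:k]  # largest |eigenvalue| first
    return vecs[:, order]            # shape (d, k), requires k <= d


def steinglm_init(X, y, hidden_sizes):
    """Return a list of (weights, bias) pairs, one per layer."""
    params = []
    H = X
    for k in hidden_sizes:
        W = stein_directions(H, y, k)
        b = -H.mean(axis=0) @ W      # assumption: center the pre-activations
        params.append((W, b))
        H = sigmoid(H @ W + b)       # forward-propagate to the next layer
    # Output layer: least squares, i.e. a Gaussian GLM with identity link.
    A = np.column_stack([H, np.ones(len(H))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    params.append((coef[:-1], coef[-1]))
    return params
```

For example, `steinglm_init(X, y, [3, 2])` on a five-dimensional input returns weight matrices of shapes (5, 3) and (3, 2) plus an output weight vector of length 2. The returned parameters are intended only as a starting point for subsequent gradient-based training; a classification response would call for a logistic rather than Gaussian GLM in the last step.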

Keywords: Generalized linear model; Initialization scheme; Multi-index model; Multi-layer feedforward neural network; Stein’s identity.

MeSH terms

Machine Learning / standards*
Software