Extreme Learning Machine for Multilayer Perceptron

Jiexiong Tang; Chenwei Deng; Guang-Bin Huang

doi:10.1109/TNNLS.2015.2424995

IEEE Trans Neural Netw Learn Syst. 2016 Apr;27(4):809-21. doi: 10.1109/TNNLS.2015.2424995. Epub 2015 May 7.

PMID: 25966483
DOI: 10.1109/TNNLS.2015.2424995

Abstract

Extreme learning machine (ELM) is an emerging learning algorithm for generalized single-hidden-layer feedforward neural networks, in which the hidden node parameters are randomly generated and the output weights are analytically computed. However, due to its shallow architecture, feature learning using ELM may not be effective for natural signals (e.g., images/videos), even with a large number of hidden nodes. To address this issue, this paper proposes a new ELM-based hierarchical learning framework for the multilayer perceptron. The proposed architecture is divided into two main components, self-taught feature extraction followed by supervised feature classification, which are bridged by randomly initialized hidden weights. The novelties of this paper are as follows: 1) unsupervised multilayer encoding is conducted for feature extraction, and an ELM-based sparse autoencoder is developed via an l1 constraint, which achieves more compact and meaningful feature representations than the original ELM; 2) by exploiting the advantages of ELM random feature mapping, the hierarchically encoded outputs are randomly projected before final decision making, which leads to better generalization with faster learning speed; and 3) unlike the greedy layerwise training of deep learning (DL), the hidden layers of the proposed framework are trained in a forward manner: once the previous layer is established, the weights of the current layer are fixed without fine-tuning. Therefore, it has much better learning efficiency than DL. Extensive experiments on various widely used classification data sets show that the proposed algorithm achieves better and faster convergence than existing state-of-the-art hierarchical learning methods. Furthermore, multiple applications in computer vision confirm the generality and capability of the proposed learning scheme.
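
The basic single-hidden-layer ELM that the abstract takes as its starting point (random hidden node parameters, analytically computed output weights) can be illustrated with a minimal NumPy sketch. This is not the hierarchical framework the paper proposes, and it is not the authors' code. The tanh activation, Gaussian random weights, ridge term `reg`, toy data, and the function names `elm_fit` and `elm_predict` are all assumptions made for illustration.

```python
import numpy as np

def elm_fit(X, T, n_hidden=50, reg=1e-3, seed=0):
    """Fit a basic ELM (illustrative sketch, hypothetical helper name).

    X: (n_samples, n_features) inputs; T: (n_samples, n_outputs) targets.
    """
    rng = np.random.default_rng(seed)
    # Hidden node parameters are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # random feature mapping (tanh is an assumption)
    # Output weights in closed form via ridge-regularized least squares:
    # beta = (H^T H + reg * I)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random mapping, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem: two Gaussian blobs (synthetic data, not from the paper).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
               rng.normal(2.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)
T = np.eye(2)[y]  # one-hot targets

W, b, beta = elm_fit(X, T)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
accuracy = (pred == y).mean()  # training accuracy on the toy data
```

Only `beta` is learned, and it comes from a single linear solve rather than iterative gradient descent, which is the source of the learning-speed advantage the abstract refers to. The paper's framework additionally stacks sparse autoencoder layers before this final step, which the sketch does not attempt to reproduce.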

Publication types

Research Support, Non-U.S. Gov't