Asymptotic Properties of Neural Network Sieve Estimators

J Nonparametr Stat. 2023;35(4):839-868. doi: 10.1080/10485252.2023.2209218. Epub 2023 May 13.


Neural networks are among the most widely used methods in machine learning and artificial intelligence. By the universal approximation theorem (Hornik et al., 1989), a neural network with one hidden layer can approximate any continuous function on a compact set arbitrarily well, provided the number of hidden units is sufficiently large. Statistically, a neural network can be viewed as a nonlinear regression model. Treated parametrically, however, its parameters are not identifiable, which makes its asymptotic properties difficult to derive. Instead, we consider the estimation problem in a nonparametric regression framework and use results from sieve estimation to establish the consistency, rates of convergence, and asymptotic normality of neural network estimators. We also illustrate the validity of the theory via simulations.
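
As an illustrative sketch of the setup described above, one common way to write a neural network sieve estimator is given here. The notation is an assumption made for exposition and does not appear in the abstract: the model is taken to be y_i = f_0(x_i) + e_i, sigma is an activation function, r_n is the number of hidden units (growing with the sample size n), and V_n is a bound on the output weights.

```latex
% Assumed notation (not from the abstract): \sigma activation, r_n hidden units,
% V_n a bound on the output-layer weights, (x_i, y_i) the observed sample.
\mathcal{F}_{r_n}
  = \Bigl\{ x \mapsto \alpha_0 + \sum_{j=1}^{r_n} \alpha_j\,
      \sigma\bigl(\gamma_j^{\top} x + \gamma_{0,j}\bigr)
      \;:\; \sum_{j=0}^{r_n} |\alpha_j| \le V_n \Bigr\},
\qquad
\hat f_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}_{r_n}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 .
```

In a formulation of this kind the estimator is a function rather than a weight vector, so its asymptotic behaviour can be studied without identifying the individual parameters.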

Keywords: Empirical Processes; Entropy Integral.