A data-driven framework for sparsity-enhanced surrogates with arbitrary mutually dependent randomness

Comput Methods Appl Mech Eng. 2019 Jun 15;350:199-227. doi: 10.1016/j.cma.2019.03.014. Epub 2019 Mar 14.

Abstract

The challenge of quantifying uncertainty propagation in real-world systems is rooted in the high dimensionality of the stochastic input and the frequent lack of explicit knowledge of its probability distribution. Traditional approaches show limitations for such problems, especially when the size of the training data is limited. To address these difficulties, we have developed a general framework for constructing surrogate models on spaces of stochastic input with arbitrary probability measure, irrespective of the mutual dependencies between individual components of the random inputs and of the analytical form of the measure. The present Data-driven Sparsity-enhancing Rotation for Arbitrary Randomness (DSRAR) framework includes a data-driven construction of a multivariate polynomial basis for arbitrary mutually dependent probability measures and a sparsity-enhancing rotation procedure. This sparsity-enhancing rotation method was initially proposed in our previous work [1] for Gaussian density distributions, but it may not be feasible for non-Gaussian distributions due to the loss of orthogonality after the rotation. To remedy this difficulty, we developed a new data-driven approach to construct orthonormal polynomials for arbitrary mutually dependent randomness, ensuring that the constructed basis maintains orthogonality or near-orthogonality with respect to the density of the rotated random vector. In this setting, directly applying regular polynomial chaos, including arbitrary polynomial chaos (aPC) [2], shows limitations due to the assumption of mutual independence between the components of the random inputs. The developed DSRAR framework leads to accurate recovery, with only limited training data, of a sparse representation of the target functions.
The effectiveness of our method is demonstrated in challenging problems such as partial differential equations and realistic molecular systems within high-dimensional (O(10)) conformational spaces where the underlying density is implicitly represented by a large collection of sample data, as well as systems with explicitly given non-Gaussian probabilistic measures.
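
As an illustration of what a data-driven orthonormal polynomial basis for mutually dependent inputs can look like, the following is a minimal sketch and not the authors' algorithm. It assumes the density is available only through samples and orthonormalizes multivariate monomials against the empirical measure with a QR factorization. The function names (`multi_indices`, `monomial_matrix`, `empirical_orthonormal_basis`) and the toy dependent, non-Gaussian input are hypothetical choices made for this example; the rotation step and the sparse recovery step of DSRAR are not shown.

```python
import itertools
import numpy as np


def multi_indices(dim, degree):
    """All exponent tuples of total degree <= degree, ordered by total degree."""
    idx = [a for a in itertools.product(range(degree + 1), repeat=dim)
           if sum(a) <= degree]
    return sorted(idx, key=lambda a: (sum(a), a))


def monomial_matrix(samples, indices):
    """Evaluate each monomial x^alpha at every sample (one row per sample)."""
    return np.column_stack(
        [np.prod(samples ** np.array(a), axis=1) for a in indices]
    )


def empirical_orthonormal_basis(samples, degree):
    """Orthonormalize monomials with respect to the empirical measure.

    With V / sqrt(n) = Q R, the functions psi = V R^{-1} satisfy
    (1/n) * Psi^T Psi = Q^T Q = I on the training samples. No independence
    between input components is assumed (assumption: n is large enough
    that V has full column rank).
    """
    n, dim = samples.shape
    indices = multi_indices(dim, degree)
    V = monomial_matrix(samples, indices)
    _, R = np.linalg.qr(V / np.sqrt(n))

    def evaluate(x):
        # Solve R^T Psi^T = V(x)^T, i.e. Psi = V(x) R^{-1}.
        return np.linalg.solve(R.T, monomial_matrix(x, indices).T).T

    return indices, evaluate


# Toy input (hypothetical): two dependent, non-Gaussian components.
rng = np.random.default_rng(0)
z = rng.standard_normal((2000, 2))
x = np.column_stack([z[:, 0], z[:, 0] ** 2 + 0.5 * z[:, 1]])

indices, evaluate = empirical_orthonormal_basis(x, degree=2)
Psi = evaluate(x)
gram = Psi.T @ Psi / x.shape[0]
print("max deviation of Gram matrix from identity:",
      np.abs(gram - np.eye(len(indices))).max())
```

On the training samples the Gram matrix equals the identity up to round-off; on fresh samples from the same density it is only approximately the identity, which corresponds to the near-orthogonality mentioned in the abstract.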

Keywords: arbitrary randomness; compressed sensing; data-driven; mutual dependence; sparsity enhancement; uncertainty quantification.