Estimation of Sobol's Sensitivity Indices under Generalized Linear Models

Commun Stat Theory Methods. 2018;47(21):5163-5195. doi: 10.1080/03610926.2017.1388397. Epub 2017 Nov 20.

Abstract

We derive explicit formulas for Sobol's sensitivity indices (SSIs) under generalized linear models (GLMs) with independent or multivariate normal inputs. We argue that the main-effect SSIs provide a powerful tool for variable selection under GLMs with identity links in the polynomial regression setting. We also show via examples that the SSI-based variable selection results are similar to those obtained by the random forest algorithm, but without the computational burden of data permutation. Finally, applying our results to the problem of gene network discovery, we identify through the SSI analysis of a public microarray dataset several novel higher-order gene-gene interactions missed by more standard inference methods. The relevant functions for SSI analysis derived here under GLMs with identity, log, and logit links are implemented and made available in the R package SobolSensitivity.
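
As an illustration of what a main-effect SSI measures in the simplest case, the sketch below assumes a first-order linear mean function f(X) = b0 + sum_i beta_i X_i with mutually independent inputs. Under those assumptions the main-effect index of input i is the standard ratio beta_i^2 Var(X_i) / sum_j beta_j^2 Var(X_j). This is not the paper's general derivation and does not use the SobolSensitivity package; the function name and the example coefficients are hypothetical.

```python
from typing import List, Sequence


def main_effect_sobol_linear(beta: Sequence[float], var_x: Sequence[float]) -> List[float]:
    """Main-effect Sobol indices for f(X) = b0 + sum_i beta[i] * X[i].

    Assumes independent inputs, so Var(f) is the sum of the per-input
    contributions beta[i]**2 * var_x[i]. The intercept b0 does not affect
    the indices, and observation noise is not included.
    """
    if len(beta) != len(var_x):
        raise ValueError("beta and var_x must have the same length")
    # Variance contributed by each input on its own.
    contrib = [b * b * v for b, v in zip(beta, var_x)]
    total = sum(contrib)
    if total <= 0:
        raise ValueError("the mean function has zero variance")
    # Each index is the share of the total variance due to one input.
    return [c / total for c in contrib]


# Hypothetical coefficients and input variances, for illustration only.
indices = main_effect_sobol_linear([2.0, 1.0, 0.0], [1.0, 4.0, 9.0])
print(indices)
```

In this toy setting an input with a zero coefficient gets an index of zero regardless of its variance, which is the intuition behind ranking or screening variables by their main-effect indices.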

Keywords: Sobol’s indices; correlated inputs; gene-gene interactions; generalized linear models; global sensitivity analysis; variable ranking; variable selection.