A Bayesian multistage spatio-temporally dependent model for spatial clustering and variable selection

Stat Med. 2023 Nov 20;42(26):4794-4823. doi: 10.1002/sim.9889. Epub 2023 Aug 31.


In spatio-temporal epidemiological analysis, it is critically important to identify the significant covariates and to estimate their time-varying effects on the health outcome. Because spatio-temporal data are heterogeneous, the subset of important covariates may vary across space, and the temporal trends of covariate effects may differ locally. However, many spatial models neglect these potential local variation patterns, leading to inappropriate inference. This article therefore proposes a flexible Bayesian hierarchical model that simultaneously identifies spatial clusters of regression coefficients with common temporal trends, selects significant covariates for each spatial group by introducing binary entry parameters, and estimates spatio-temporally varying disease risks. A multistage strategy is employed to reduce the confounding bias caused by spatially structured random components. A simulation study shows that the proposed method outperforms several alternatives under different assessment criteria. The methodology is motivated by two important case studies. The first concerns low birth weight incidence data in 159 counties of Georgia, USA, for the years 2007 to 2018 and investigates the time-varying effects of potential contributing covariates in different cluster regions. The second concerns circulatory disease risks across 323 local authorities in England over 10 years and explores the underlying spatial clusters and the associated important risk factors.

Keywords: Bayesian hierarchical model; spatial clustering; spatial confounding problem; spatio-temporal modeling; variable selection.