Abstract: Several statistical issues associated with health care costs, such as heteroscedasticity and severe skewness, make it challenging to estimate or predict medical costs. When the interest is in modeling the mean cost, it is desirable to make no assumptions about the density function or higher-order moments. Another challenge in developing cost prediction models is the presence of many covariates, which makes it necessary to apply variable selection methods to balance prediction accuracy and model simplicity. We propose Spike-or-Slab priors for Bayesian variable selection based on asymptotically normal estimates of the full model parameters, which are consistent as long as the assumption on the mean cost is satisfied. In addition, the scope of the model search can be reduced by ranking the Z-statistics. The method is simultaneously robust (it avoids assumptions on the density function or higher-order moments), parsimonious (through variable selection), informative (its Bayesian formulation allows the posterior probabilities of candidate models to be compared), and efficient (Z-ranking reduces the scope of the model search). We apply this method to the Medical Expenditure Panel Survey dataset.
Keywords: Spike-or-Slab prior; health econometrics; sandwich variance estimator; variable selection.