A semi-parametric Bayesian model for semi-continuous longitudinal data

Stat Med. 2022 Jun 15;41(13):2354-2374. doi: 10.1002/sim.9359. Epub 2022 Mar 10.

Abstract

Semi-continuous data present challenges in both model fitting and interpretation. Parametric distributions may be inappropriate for the extremely long right tails of such data. Mean effects of covariates, which are susceptible to extreme values, may fail to capture relevant information for most of the sample. We propose a two-component semi-parametric Bayesian mixture model, with the discrete component captured by a probability mass (typically at zero) and the continuous component of the density modeled by a mixture of B-spline densities that can be flexibly fit to any data distribution. The model includes subject-level random effects to allow for application to longitudinal data. We specify prior distributions on parameters and perform model inference using a Markov chain Monte Carlo (MCMC) Gibbs-sampling algorithm programmed in R. Statistical inference can be made for multiple quantiles of the covariate effects simultaneously, providing a comprehensive view. Various MCMC sampling techniques are used to facilitate convergence. We demonstrate the performance and the interpretability of the model via simulations and analyses of the National Consortium on Alcohol and Neurodevelopment in Adolescence study (NCANDA) data on alcohol binge drinking.
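
The two-component structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (which is in R and includes covariates, random effects, and MCMC sampling); it only evaluates a density with a point mass at zero plus a mixture of normalized B-spline basis functions. The function names, the knot vector, the mixture weights, and the zero probability are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: a point mass at zero plus a mixture of
# B-spline densities. All names and numeric values are assumptions.

def bspline_basis(x, knots, k, degree):
    """Cox-de Boor recursion: value at x of the k-th B-spline basis function."""
    if degree == 0:
        return 1.0 if knots[k] <= x < knots[k + 1] else 0.0
    left_den = knots[k + degree] - knots[k]
    right_den = knots[k + degree + 1] - knots[k + 1]
    left = 0.0
    right = 0.0
    # Terms with a zero denominator (repeated knots) are defined to be zero.
    if left_den > 0:
        left = (x - knots[k]) / left_den * bspline_basis(x, knots, k, degree - 1)
    if right_den > 0:
        right = ((knots[k + degree + 1] - x) / right_den
                 * bspline_basis(x, knots, k + 1, degree - 1))
    return left + right


def bspline_density(x, knots, k, degree):
    """Basis function rescaled so that it integrates to 1 over its support."""
    width = knots[k + degree + 1] - knots[k]
    return (degree + 1) / width * bspline_basis(x, knots, k, degree)


def likelihood_contribution(y, p_zero, weights, knots, degree=3):
    """Two-part likelihood: probability mass at zero, density elsewhere."""
    if y == 0:
        return p_zero
    mixture = sum(w * bspline_density(y, knots, k, degree)
                  for k, w in enumerate(weights))
    return (1.0 - p_zero) * mixture


# Hypothetical settings: cubic splines on (0, 16) with knots spaced more
# widely in the right tail; 12 knots give 12 - 3 - 1 = 8 basis functions.
KNOTS = [0, 0, 0, 0, 1, 2, 4, 8, 16, 16, 16, 16]
WEIGHTS = [0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02]  # sum to 1
P_ZERO = 0.4  # assumed probability of an exact zero

for y in (0.0, 0.5, 3.0, 10.0):
    print(y, likelihood_contribution(y, P_ZERO, WEIGHTS, KNOTS))
```

Because each rescaled basis function integrates to one and the weights sum to one, the point mass and the continuous part together integrate to one. In a full model of the kind the abstract describes, the zero probability and the mixture weights would depend on covariates and subject-level random effects rather than being fixed constants as they are here.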

Keywords: B-spline; Bayesian; Markov chain Monte Carlo; longitudinal; semi-continuous; semi-parametric.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Humans
  • Markov Chains
  • Models, Statistical*
  • Monte Carlo Method