Analysis of repeated measures data with clumping at zero

Stat Methods Med Res. 2002 Aug;11(4):341-55. doi: 10.1191/0962280202sm291ra.


Longitudinal or repeated measures data with clumping at zero occur in many applications in biometrics, including health policy research, epidemiology, nutrition, and meteorology. These data exhibit correlation because they are measured on the same subject over time or because subjects may be considered repeated measures within a larger unit such as a family. They present special challenges because of the extreme non-normality of the distributions involved. A model for repeated measures data with clumping at zero, using a mixed-effects mixed-distribution model with correlated random effects, is presented. The model contains components to model the probability of a nonzero value and the mean of nonzero values, allowing for repeated measurements using random effects and allowing for correlation between the two components. Methods for describing the effect of predictor variables on the probability of nonzero values, on the mean of nonzero values, and on the overall mean amount are given. This interpretation also applies to the mixed-distribution model for cross-sectional data. The proposed methods are illustrated with analyses of effects of several covariates on medical expenditures in 1996 for subjects clustered within households using data from the Medical Expenditure Panel Survey.
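
As a rough illustration of how the two components combine into the overall mean amount, the sketch below computes the probability of a nonzero value, the mean of nonzero values, and their product for one subject, conditional on that subject's random effects. The abstract does not state the link function or the distribution of the nonzero values; the logit link and the lognormal distribution used here are assumptions, and all function and parameter names (`overall_mean`, `eta_prob`, `eta_amount`, `sigma2`, `u`, `v`) are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def overall_mean(eta_prob, eta_amount, sigma2, u=0.0, v=0.0):
    """Combine the two model components, conditional on random effects.

    Assumptions (not stated in the abstract):
      - P(Y > 0) follows a logit model with linear predictor eta_prob + u
      - log(Y) given Y > 0 is normal with mean eta_amount + v and
        variance sigma2, so nonzero values are lognormal
    u and v are the subject's random effects in the two components.
    """
    p_nonzero = expit(eta_prob + u)
    # Mean of a lognormal variable: exp(mu + sigma^2 / 2)
    mean_nonzero = math.exp(eta_amount + v + sigma2 / 2.0)
    # Overall mean: E[Y] = P(Y > 0) * E[Y | Y > 0]
    return p_nonzero, mean_nonzero, p_nonzero * mean_nonzero


if __name__ == "__main__":
    p, m, overall = overall_mean(eta_prob=0.0, eta_amount=0.0, sigma2=0.0)
    print(p, m, overall)
```

These quantities are conditional on the random effects `u` and `v`. A population-averaged overall mean would require integrating over their joint distribution, including the correlation between the two components, which this sketch does not attempt.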

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Biometry / methods*
  • Cross-Sectional Studies
  • Data Interpretation, Statistical*
  • Health Expenditures / statistics & numerical data*
  • Models, Statistical*
  • Probability
  • United States