Questionnaire data may contain missing values because certain questions do not apply to all respondents. For instance, questions addressing particular attributes of a symptom, such as frequency, triggers or seasonality, are only applicable to those who have experienced the symptom, while for those who have not, responses to these items will be missing. This missing information does not fall into the category 'missing by design', rather the features of interest do not exist and cannot be measured regardless of survey design. Analysis of responses to such conditional items is therefore typically restricted to the subpopulation in which they apply. This article is concerned with joint multivariate modelling of responses to both unconditional and conditional items without restricting the analysis to this subpopulation. Such an approach is of interest when the distributions of both types of responses are thought to be determined by common parameters affecting the whole population. By integrating the conditional item structure into the model, inference can be based both on unconditional data from the entire population and on conditional data from subjects for whom they exist. This approach opens new possibilities for multivariate analysis of such data. We apply this approach to latent class modelling and provide an example using data on respiratory symptoms (wheeze and cough) in children. Conditional data structures such as that considered here are common in medical research settings and, although our focus is on latent class models, the approach can be applied to other multivariate models.
Copyright (c) 2009 John Wiley & Sons, Ltd.