A bivariate measurement error model for semicontinuous and continuous variables: Application to nutritional epidemiology

Biometrics. 2016 Mar;72(1):106-15. doi: 10.1111/biom.12377. Epub 2015 Aug 31.

Abstract

Semicontinuous data in the form of a mixture of a large portion of zero values and continuously distributed positive values frequently arise in many areas of biostatistics. This article is motivated by the analysis of relationships between disease outcomes and intakes of episodically consumed dietary components. An important aspect of studies in nutritional epidemiology is that true diet is unobservable and commonly evaluated by food frequency questionnaires with substantial measurement error. Following the regression calibration approach for measurement error correction, unknown individual intakes in the risk model are replaced by their conditional expectations given mismeasured intakes and other model covariates. Those regression calibration predictors are estimated using short-term unbiased reference measurements in a calibration substudy. Since dietary intakes are often "energy-adjusted," e.g., by using ratios of the intake of interest to total energy intake, the correct estimation of the regression calibration predictor for each energy-adjusted episodically consumed dietary component requires modeling short-term reference measurements of the component (a semicontinuous variable), and energy (a continuous variable) simultaneously in a bivariate model. In this article, we develop such a bivariate model, together with its application to regression calibration. We illustrate the new methodology using data from the NIH-AARP Diet and Health Study (Schatzkin et al., 2001, American Journal of Epidemiology 154, 1119-1125), and also evaluate its performance in a simulation study.
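The regression calibration step described above can be illustrated with a small simulation. This is only a minimal sketch of the general idea in the simplest univariate, continuous, linear setting. It is not the bivariate semicontinuous model developed in the article, and it ignores energy adjustment and zero intakes. The function name, sample sizes, error variances, and the linear questionnaire model are all assumptions chosen for illustration.

```python
import numpy as np


def regression_calibration_demo(n=20000, n_calib=5000, beta=0.5, seed=0):
    """Compare a naive slope estimate with a regression-calibrated one.

    All distributions and parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # True long-term intake: unobservable in a real study.
    t = rng.normal(0.0, 1.0, n)

    # Questionnaire-style measurement with systematic bias and random error
    # (hypothetical linear error model).
    q = 0.3 + 0.5 * t + rng.normal(0.0, 1.0, n)

    # Continuous outcome that depends on true intake with slope beta.
    y = beta * t + rng.normal(0.0, 1.0, n)

    # Calibration substudy: an unbiased short-term reference measurement
    # is available only for the first n_calib participants.
    r = t[:n_calib] + rng.normal(0.0, 1.0, n_calib)

    # Naive analysis: regress the outcome directly on the mismeasured intake.
    naive_slope = np.polyfit(q, y, 1)[0]

    # Regression calibration: estimate E[T | Q] by regressing the unbiased
    # reference on the questionnaire value within the substudy ...
    calib_slope, calib_intercept = np.polyfit(q[:n_calib], r, 1)

    # ... then replace the unknown intake by its predicted value for everyone.
    t_hat = calib_intercept + calib_slope * q

    # Risk model fitted on the calibrated predictor.
    calibrated_slope = np.polyfit(t_hat, y, 1)[0]

    return naive_slope, calibrated_slope


if __name__ == "__main__":
    naive, calibrated = regression_calibration_demo()
    print(f"naive slope:      {naive:.3f}")
    print(f"calibrated slope: {calibrated:.3f}")
```

Under these assumed settings the naive slope is attenuated toward zero, while the calibrated slope should land near the value of `beta` used to generate the data. For an energy-adjusted, episodically consumed component, the predictor would instead have to come from a joint model of the component and energy, which is the problem the article addresses.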

Keywords: Bivariate modeling; Episodically consumed dietary components; Measurement error; Nutritional epidemiology; Regression calibration; Semicontinuous variables.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Diet / statistics & numerical data*
  • Energy Intake
  • Humans
  • Models, Statistical*
  • Nutrition Assessment*
  • Reproducibility of Results
  • Sample Size
  • Sensitivity and Specificity
  • United States / epidemiology