Multiple imputation for multivariate data with missing and below-threshold measurements: time-series concentrations of pollutants in the Arctic

Biometrics. 2001 Mar;57(1):22-33. doi: 10.1111/j.0006-341x.2001.00022.x.

Abstract

Many chemical and environmental data sets are complicated by fully missing values or by censored values known only to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, and some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in the missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences in the presence of missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect these models to be useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
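
As a rough illustration of the general idea, and not of the three models developed in the paper, the sketch below multiply imputes a single log-scale concentration series in which some values are below a detection limit and others are fully missing. It assumes a univariate normal model on the log scale with a noninformative prior, which ignores both the multivariate and the time-series structure. The function names (`draw_below`, `multiply_impute`, `rubin_combine`) and the toy numbers are hypothetical. Imputations come from a simple data-augmentation loop, and the per-data-set estimates are pooled with Rubin's combining rules.

```python
import math
import random
from statistics import NormalDist, mean, variance


def draw_below(rng, mu, sigma, upper):
    """Draw one value from N(mu, sigma^2) truncated to (-inf, upper]."""
    dist = NormalDist(mu, sigma)
    u = rng.random() * dist.cdf(upper)       # uniform on [0, P(X <= upper))
    u = min(max(u, 1e-12), 1.0 - 1e-12)      # inv_cdf requires 0 < u < 1
    return min(dist.inv_cdf(u), upper)       # guard against rounding error


def multiply_impute(log_obs, n_censored, log_limit, n_missing,
                    m=5, burn_in=50, thin=10, seed=0):
    """Return m completed data sets: observed + censored + missing values.

    Assumption: values are i.i.d. normal on the log scale (a simplification).
    """
    rng = random.Random(seed)
    mu, sigma = mean(log_obs), math.sqrt(variance(log_obs))  # starting values
    completed = []
    for it in range(burn_in + m * thin):
        # Imputation step: fill in censored and missing values given (mu, sigma).
        cens = [draw_below(rng, mu, sigma, log_limit) for _ in range(n_censored)]
        miss = [rng.gauss(mu, sigma) for _ in range(n_missing)]
        full = list(log_obs) + cens + miss
        # Posterior step: draw (mu, sigma) given the completed data.
        n, ybar, s2 = len(full), mean(full), variance(full)
        chi2 = rng.gammavariate((n - 1) / 2, 2.0)  # chi-square with n-1 df
        sigma = math.sqrt((n - 1) * s2 / chi2)
        mu = rng.gauss(ybar, sigma / math.sqrt(n))
        # Keep every `thin`-th completed data set after burn-in.
        if it >= burn_in and (it - burn_in) % thin == 0:
            completed.append(full)
    return completed


def rubin_combine(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 divisor)
    return mean(estimates), within + (1 + 1 / m) * between


# Toy data (hypothetical): 8 observed values, 3 below the limit, 2 missing.
log_obs = [math.log(x) for x in [0.8, 1.2, 2.5, 0.6, 1.9, 3.1, 0.9, 1.4]]
log_limit = math.log(0.5)
completed = multiply_impute(log_obs, n_censored=3, log_limit=log_limit, n_missing=2)

# Complete-data analysis on each data set: mean log concentration and its variance.
estimates = [mean(d) for d in completed]
variances = [variance(d) / len(d) for d in completed]
pooled_mean, pooled_var = rubin_combine(estimates, variances)
print(f"pooled mean log concentration: {pooled_mean:.3f} (variance {pooled_var:.4f})")
```

Drawing the parameters anew at each iteration, rather than fixing them at a single estimate, lets the imputations reflect parameter uncertainty, and that is what allows the pooled variance from Rubin's rules to account for the missing information.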

Publication types

  • Comparative Study

MeSH terms

  • Air Pollutants / analysis*
  • Arctic Regions
  • Biometry
  • Data Interpretation, Statistical
  • Models, Statistical
  • Multivariate Analysis*
  • Northwest Territories
  • Time Factors

Substances

  • Air Pollutants