Missing... presumed at random: cost-analysis of incomplete data

Health Econ. 2003 May;12(5):377-92. doi: 10.1002/hec.766.


When collecting patient-level resource use data for statistical analysis, the required count will not be observed for some patients and in some categories of resource use. Although this problem must arise in most reported economic evaluations containing patient-level data, it is rare for authors to detail how the problem was overcome. Statistical packages may default to handling missing data through a so-called 'complete case analysis', while some recent cost-analyses have appeared to favour an 'available case' approach. Both of these methods are problematic: complete case analysis is inefficient and is likely to be biased; available case analysis, by employing different numbers of observations for each resource use item, generates severe problems for standard statistical inference. Instead, we explore imputation methods for generating 'replacement' values for missing data that will permit complete case analysis using the whole data set, and we illustrate these methods using two data sets that had incomplete resource use information.
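
The difference between the three approaches named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient records, resource use item names and unit costs are invented, and unconditional mean imputation is used as a deliberately naive stand-in, since the abstract does not say which imputation methods the paper examines.

```python
from statistics import mean

# Hypothetical data: resource use counts per patient; None marks a missing count.
patients = [
    {"gp_visits": 2, "inpatient_days": 5, "outpatient_visits": 1},
    {"gp_visits": 4, "inpatient_days": None, "outpatient_visits": 3},
    {"gp_visits": None, "inpatient_days": 7, "outpatient_visits": 2},
    {"gp_visits": 3, "inpatient_days": 6, "outpatient_visits": None},
    {"gp_visits": 1, "inpatient_days": 2, "outpatient_visits": 0},
]
# Hypothetical unit costs per resource use item.
unit_costs = {"gp_visits": 20.0, "inpatient_days": 250.0, "outpatient_visits": 60.0}
items = list(unit_costs)


def total_cost(record):
    """Total cost for one patient whose counts are all present."""
    return sum(record[i] * unit_costs[i] for i in items)


def complete_case_mean_cost(data):
    """Drop every patient with any missing item, then average total cost."""
    complete = [r for r in data if all(r[i] is not None for i in items)]
    return mean(total_cost(r) for r in complete), len(complete)


def available_case_mean_cost(data):
    """Average each item over whoever reported it, then sum the item costs.

    Each item may rest on a different number of patients, which is what
    complicates standard inference on the total.
    """
    total = 0.0
    n_used = {}
    for i in items:
        observed = [r[i] for r in data if r[i] is not None]
        n_used[i] = len(observed)
        total += mean(observed) * unit_costs[i]
    return total, n_used


def mean_imputed_mean_cost(data):
    """Fill each missing count with the observed item mean (naive stand-in),
    then analyse all patients as if the data were complete."""
    item_means = {i: mean(r[i] for r in data if r[i] is not None) for i in items}
    filled = [
        {i: (r[i] if r[i] is not None else item_means[i]) for i in items}
        for r in data
    ]
    return mean(total_cost(r) for r in filled), len(filled)


cc_cost, cc_n = complete_case_mean_cost(patients)
ac_cost, ac_n = available_case_mean_cost(patients)
imp_cost, imp_n = mean_imputed_mean_cost(patients)
print(f"complete case:  mean cost {cc_cost:.2f} from {cc_n} patients")
print(f"available case: mean cost {ac_cost:.2f}, observations per item {ac_n}")
print(f"mean imputation: mean cost {imp_cost:.2f} from {imp_n} patients")
```

In this toy data set, complete case analysis keeps only two of the five patients, which illustrates the loss of efficiency. Mean imputation restores a per-patient total for everyone, but it treats filled-in values as if they had been observed and so understates variability; it is shown only to make the mechanics concrete.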

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Costs and Cost Analysis
  • Data Interpretation, Statistical*
  • Diabetes Mellitus, Type 2 / therapy
  • Efficiency
  • Humans
  • Laser Therapy / statistics & numerical data
  • Length of Stay / statistics & numerical data
  • Likelihood Functions
  • Male
  • Patient Dropouts
  • Randomized Controlled Trials as Topic / economics
  • Randomized Controlled Trials as Topic / statistics & numerical data*
  • Research Design*
  • Transurethral Resection of Prostate / statistics & numerical data