Background: Cost data in healthcare are often skewed across patients. Thus, researchers have used either a log transformation of the dependent variable or generalized linear models with log links. However, these non-linear approaches frequently produce non-linear incremental effects: the incremental effect differs across levels of the covariates, which can dramatically change predicted cost.
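To make the covariate dependence concrete, the following is a minimal sketch assuming a single binary treatment indicator T and a single covariate x. The symbols \beta, \gamma and \Delta are illustrative notation, not taken from the study.

```latex
\begin{align*}
\text{Linear:}\quad & E[y \mid T, x] = \beta_0 + \beta_1 T + \beta_2 x \\
& \Delta_{\text{lin}}(x) = E[y \mid T{=}1, x] - E[y \mid T{=}0, x] = \beta_1 \\[4pt]
\text{Log link:}\quad & E[y \mid T, x] = \exp(\gamma_0 + \gamma_1 T + \gamma_2 x) \\
& \Delta_{\log}(x) = \exp(\gamma_0 + \gamma_2 x)\,\bigl(e^{\gamma_1} - 1\bigr)
\end{align*}
```

Under these assumptions the linear incremental effect is the constant \beta_1, whereas the log-link incremental effect is rescaled by \exp(\gamma_2 \, \delta) whenever the covariate shifts by \delta, so it is constant only if \gamma_2 = 0.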
Objectives: The aim of this study was to demonstrate that when modelling skewed data, log link functions or log transformations are not necessary and have unintended effects.
Methods: We simulated cost data using a linear model with a 'treatment', a covariate and a specified number of observations with excessive cost (skewed data). We also used actual data from a pain-relief intervention among hip-replacement patients. We then estimated cost models using various functional approaches suggested to handle skew and calculated the incremental cost of treatment at various levels of the covariate(s).
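A simulation of this kind might be sketched as follows. Everything specific here is an assumption for illustration and not the study's actual design: the coefficient values, the sample size, the exponential form of the excess cost, the function names, and the use of Duan's smearing factor to retransform the log-OLS predictions.

```python
import numpy as np


def simulate_costs(n=5000, n_outliers=100, seed=0):
    """Linear cost model plus a fixed number of excessive-cost observations."""
    rng = np.random.default_rng(seed)
    treat = rng.integers(0, 2, size=n)      # hypothetical 0/1 treatment
    x = rng.uniform(0.0, 10.0, size=n)      # hypothetical covariate
    # Assumed true model: the incremental treatment cost is 500 at every x.
    cost = 1000.0 + 500.0 * treat + 200.0 * x + rng.normal(0.0, 100.0, size=n)
    # Add excess cost to a random subset to create right skew.
    idx = rng.choice(n, size=n_outliers, replace=False)
    cost[idx] += rng.exponential(5000.0, size=n_outliers)
    return treat, x, cost


def fit_ols(treat, x, y):
    """Least-squares fit of y on [1, treat, x]; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones_like(x), treat, x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


def incremental_linear(beta, x_levels):
    """Incremental treatment cost under the linear model: constant in x."""
    return np.full_like(x_levels, beta[1], dtype=float)


def incremental_log(gamma, smear, x_levels):
    """Incremental treatment cost implied by log-OLS after retransformation."""
    baseline = smear * np.exp(gamma[0] + gamma[2] * x_levels)
    return baseline * (np.exp(gamma[1]) - 1.0)


if __name__ == "__main__":
    treat, x, cost = simulate_costs()
    beta, _ = fit_ols(treat, x, cost)
    gamma, log_resid = fit_ols(treat, x, np.log(cost))
    smear = np.mean(np.exp(log_resid))      # Duan's smearing factor
    levels = np.array([0.0, 5.0, 10.0])
    print("linear :", incremental_linear(beta, levels))
    print("log-OLS:", incremental_log(gamma, smear, levels))
```

In this sketch the linear fit reports the same incremental cost at every covariate level, while the log-based fit reports one that grows with the covariate even though the data were generated from a linear model.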
Results: All of these methods provide unbiased estimates of the incremental effect of treatment on costs at the mean level of the covariate. However, in some log-based models the implied incremental treatment cost doubled between the extreme low and high values of the covariate, in a manner inconsistent with the underlying linear model.
Conclusions: Although specification checks are always needed, the potential for misleading incremental estimates resulting from log-based specifications is often ignored. In this era of cost containment and comparisons of treatment effectiveness, it is vital that researchers and policymakers understand the limitations of the inferences that can be made using log-based models for patients whose characteristics differ from the sample mean.