The distributions of medical costs are often skewed to the right because small numbers of patients use large amounts of health care resources. Using data from a study of colon cancer costs, we show, by example, the impact and magnitude of outliers and influential observations on health care costs and compared the effects of statistical costing methods for addressing the disproportionate influence of outliers and influential observations. We used data from a retrospective cohort study of 3,842 elderly veterans with colon cancer who were enrolled in and used health care from, both the Department of Veterans Affairs and Medicare in 1999-2004. After calculating the average colon cancer episode cost and distribution for the full cohort, we used box-plot methods, Winsorization, DFBETAs, and Cook's distance to identify and assess or adjust the outlying and/or influential observations. The number of observations identified as outlying and/or influential ranged from 13 when the predicted DFBETA measurement was greater than 0.15 and the observation was a qualified box-plot outlier to 384 cases using the Winsorization method at the 5th and 95th percentiles. Average costs of colon cancer episodes using these methods were similar. The method of choice from the results of this particular analysis can be conditionally based on whether the purpose is to control only for influential observations or to simultaneously control for outliers and influential observations. Understanding how estimates could change with each approach is important in assessing the impact of a particular method on the results.
Keywords: Colon cancer; Episode of care; Health care costs; Influential observations; Outliers.