Purpose: Health state utilities measured by the major multi-attribute utility instruments differ. Understanding the reasons for this is important for the choice of instrument and for research designed to reconcile these differences. This paper investigates these reasons by explaining pairwise differences between utilities derived from six multi-attribute utility instruments in terms of (1) their implicit measurement scales; (2) the structure of their descriptive systems; and (3) 'micro-utility effects', scale-adjusted differences attributable to their utility formula.
Methods: The EQ-5D-5L, SF-6D, HUI 3, 15D and AQoL-8D were administered to 8,019 individuals. Utilities and unweighted values were calculated using each instrument. Scale effects were determined from the linear relationship between utilities; the effect of the descriptive system from a comparison of scale-adjusted values; and 'micro-utility effects' from the unexplained difference between utilities and values.
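The three-way split described in the Methods can be illustrated for a single pair of instruments. The sketch below is one possible reading and not the procedure used in the study: the function names, the per-person identity, the use of mean absolute size to form shares, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def linear_rescale(x, target):
    """Least-squares fit target ~ intercept + slope * x; return x on target's scale."""
    slope, intercept = np.polyfit(x, target, 1)
    return intercept + slope * x


def decompose(u_a, u_b, v_a, v_b):
    """Split u_a - u_b per person into three parts (hypothetical decomposition).

    u_a, u_b: utilities from instruments A and B.
    v_a, v_b: unweighted values from the same two descriptive systems.
    """
    u_b_adj = linear_rescale(u_b, u_a)  # B's utilities expressed on A's scale
    v_a_adj = linear_rescale(v_a, u_a)  # unweighted values on the same scale
    v_b_adj = linear_rescale(v_b, u_a)
    scale = u_b_adj - u_b                   # removed by linear rescaling alone
    descriptive = v_a_adj - v_b_adj         # remains once weights are stripped out
    micro = (u_a - u_b_adj) - descriptive   # unexplained remainder
    return scale, descriptive, micro


# Synthetic data for illustration only; these are not the study's observations.
rng = np.random.default_rng(0)
health = rng.uniform(0.0, 1.0, 500)
v_a = health + rng.normal(0.0, 0.05, 500)
v_b = health + rng.normal(0.0, 0.10, 500)
u_a = 0.2 + 0.8 * v_a + rng.normal(0.0, 0.02, 500)
u_b = -0.1 + 1.1 * v_b + rng.normal(0.0, 0.02, 500)

scale, descriptive, micro = decompose(u_a, u_b, v_a, v_b)

# One assumed way to express relative importance: share of mean absolute size.
sizes = {
    "scale": np.mean(np.abs(scale)),
    "descriptive": np.mean(np.abs(descriptive)),
    "micro": np.mean(np.abs(micro)),
}
total = sum(sizes.values())
shares = {name: size / total for name, size in sizes.items()}
print(shares)
```

By construction the three components sum to the raw pairwise difference for every individual, so the split is exhaustive; how the components are aggregated into percentages is a separate choice, and the mean-absolute-size rule used here is only one option.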
Results: Overall, 66 % of the differences between utilities were attributable to the descriptive systems, 30.3 % to scale effects and 3.7 % to micro-utility effects.
Discussion: The results imply that revising utility algorithms will not reconcile differences between instruments. The dominant importance of the descriptive system highlights the need for researchers to select the instrument most capable of describing the health states relevant to a study.
Conclusions: Reconciliation of inconsistent utilities produced by different instruments must focus primarily upon the content of the descriptive system. Utility weights primarily determine the measurement scale. Other differences, attributable to the utility formula, are comparatively unimportant.