Statistical significant change versus relevant or important change in (quasi) experimental design: some conceptual and methodological problems in estimating magnitude of intervention-related change in health services research

Int J Integr Care. 2002:2:e15. doi: 10.5334/ijic.65. Epub 2002 Dec 17.


This paper aims to identify problems in estimating and interpreting the magnitude of intervention-related change over time, or responsiveness, as assessed with health outcome measures. Responsiveness is a problematic construct, and there is no consensus on which index is appropriate for quantifying change between baseline and post-test assessments. This paper gives an overview of several responsiveness indices. Thresholds for interpreting effect size (or responsiveness index) were introduced some thirty years ago by Cohen, who standardised the difference scores (d) with the pooled standard deviation (d/SD(pooled)). However, many effect sizes (ES) have been introduced since Cohen's original work, and in the formula of one of them the mean change score is standardised with the SD of the change scores (d/SD(change)). This effect size, known as the Standardized Response Mean (SRM), is widely applied when health outcome questionnaires are used. Its interpretation is problematic, however, when it is used as an estimate of the magnitude of change over time and interpreted with the thresholds that Cohen set for the ES based on SD(pooled). Applying these well-known cut-off points for pooled standard deviation units, namely 'trivial' (ES < 0.20), 'small' (0.20 ≤ ES < 0.50), 'moderate' (0.50 ≤ ES < 0.80) and 'large' (ES ≥ 0.80), to the SRM may lead to over- or underestimation of the magnitude of intervention-related change over time, because the SD of the change scores depends on the correlation between baseline and outcome assessments. Consequently, Cohen's thresholds should not be taken for granted for every version of effect size index used to estimate the magnitude of intervention-related change.
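
The dependence on the baseline-outcome correlation can be illustrated with a minimal sketch. It rests on an assumption that the abstract does not state: the SDs at baseline and post-test are equal. In that case SD(change) = SD(pooled) × sqrt(2(1 − r)), where r is the correlation between the two assessments, so SRM = ES / sqrt(2(1 − r)). The function names below are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import math
import statistics


def effect_size_pooled(baseline, followup):
    """Mean change divided by the pooled SD of the two assessments (equal n)."""
    pooled_sd = math.sqrt(
        (statistics.variance(baseline) + statistics.variance(followup)) / 2
    )
    return (statistics.mean(followup) - statistics.mean(baseline)) / pooled_sd


def standardized_response_mean(baseline, followup):
    """Mean change divided by the SD of the individual change scores."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return statistics.mean(changes) / statistics.stdev(changes)


def srm_from_es(es, r):
    """Convert ES to SRM, ASSUMING equal SDs at baseline and post-test.

    Under that assumption SD(change) = SD(pooled) * sqrt(2 * (1 - r)).
    """
    return es / math.sqrt(2 * (1 - r))


def cohen_label(value):
    """Cohen's cut-off points as quoted in the abstract."""
    size = abs(value)
    if size < 0.20:
        return "trivial"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"


if __name__ == "__main__":
    es = 0.50  # hypothetical effect size in pooled-SD units
    for r in (0.2, 0.5, 0.9):  # hypothetical baseline-outcome correlations
        srm = srm_from_es(es, r)
        print(f"r={r:.1f}  ES={es:.2f} ({cohen_label(es)})  "
              f"SRM={srm:.2f} ({cohen_label(srm)})")
```

Under the equal-SD assumption the two indices coincide only at r = 0.5. For r above 0.5 the SRM exceeds the ES, and for r below 0.5 it falls short of it, so the same mean change can land in different Cohen categories depending on which index is reported.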