A taxonomy of effect size measures for the differential functioning of items and scales

J Appl Psychol. 2010 Jul;95(4):728-43. doi: 10.1037/a0018966.


Much progress has been made in the past 2 decades with respect to methods of identifying measurement invariance or a lack thereof. Until now, the focus of these efforts has been to establish criteria for statistical significance in items and scales that function differently across samples. The power associated with tests of differential functioning, as with all significance tests, is affected by sample size and other considerations. Additionally, statistical significance need not imply practical importance. There is therefore a strong need for meaningful effect size indicators to describe the extent to which items and scales function differently. Recently developed effect size measures show promise for providing a metric to describe the amount of differential functioning present between groups. Expanding upon recent developments, this article presents a taxonomy of potential differential functioning effect sizes; several new indices of item and scale differential functioning effect size are proposed and illustrated with 2 data samples. Software created for computing these indices and graphing item- and scale-level differential functioning is described.
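The point that significance depends on sample size while an effect size does not can be illustrated with a minimal sketch. It is not taken from the article and does not implement any of its indices: it assumes a generic two-group comparison with equal group sizes and unit variance, and the function names `two_sample_z` and `two_sided_p` are hypothetical.

```python
import math


def two_sample_z(d, n_per_group):
    """z statistic for a standardized mean difference d.

    Assumes two independent groups of equal size with unit variance,
    so the standard error of the difference is sqrt(2 / n_per_group).
    """
    return d * math.sqrt(n_per_group / 2.0)


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    d = 0.1  # the same small standardized difference in every row
    for n in (20, 200, 2000):
        z = two_sample_z(d, n)
        print(f"n per group = {n:5d}  d = {d:.2f}  z = {z:.3f}  p = {two_sided_p(z):.4f}")
```

The effect size stays fixed across rows while the test statistic grows with the square root of the sample size, so the same small difference can move from nonsignificant to significant purely because more people were sampled.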

MeSH terms

  • Cross-Cultural Comparison
  • Data Collection / standards
  • Data Collection / statistics & numerical data
  • Data Interpretation, Statistical*
  • Humans
  • Models, Statistical
  • Psychological Tests / standards
  • Psychological Tests / statistics & numerical data*
  • Psychometrics / standards
  • Psychometrics / statistics & numerical data
  • Reference Values
  • Sample Size
  • Software / standards