Plausible-Value Imputation Statistics for Detecting Item Misfit

Appl Psychol Meas. 2017 Jul;41(5):372-387. doi: 10.1177/0146621617692079. Epub 2017 Feb 1.

Abstract

When tests consist of a small number of items, the use of latent trait estimates for secondary analyses is problematic. One area where such estimates have been particularly problematic is testing for item misfit. This article explores the use of plausible-value imputations to lessen the severity of the inherent measurement unreliability in shorter tests, and proposes a parametric bootstrap procedure to generate empirical sampling characteristics for null-hypothesis tests of item fit. Simulation results suggest that the proposed item-fit statistics provide conservative to nominal error detection rates. Power to detect item misfit tended to be lower than that of Stone's χ²* item-fit statistic but higher than that of the S-X² statistic proposed by Orlando and Thissen, especially in tests with 20 or more dichotomously scored items.

Keywords: item response theory; item-fit statistics; parametric bootstrap; plausible-value imputation.
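
The two ideas named in the abstract, plausible-value imputation and a parametric bootstrap reference distribution, can be illustrated with a minimal sketch. This is not the statistic proposed in the article: it assumes a 2PL model with item parameters treated as known, a standard normal prior evaluated on a fixed grid, and a simple binned Pearson-type discrepancy averaged over imputations. All function names, the grid, and the bin count are hypothetical choices made for illustration. A fuller implementation would also re-estimate item parameters within each bootstrap replicate.

```python
import numpy as np

def prob_2pl(theta, a, b):
    """2PL success probabilities; theta (n,), a and b (J,) -> array (n, J)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def draw_plausible_values(resp, a, b, rng, grid=np.linspace(-4, 4, 81)):
    """Draw one plausible value per examinee from a grid posterior.

    Assumption: N(0, 1) prior and known item parameters.
    """
    p = prob_2pl(grid, a, b)                                   # (G, J)
    loglik = resp @ np.log(p).T + (1 - resp) @ np.log(1 - p).T  # (n, G)
    logpost = loglik - 0.5 * grid**2
    post = np.exp(logpost - logpost.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)
    # Inverse-CDF sampling on the grid, one draw per examinee.
    cdf = post.cumsum(axis=1)
    u = rng.random(resp.shape[0])
    idx = np.minimum((cdf < u[:, None]).sum(axis=1), len(grid) - 1)
    return grid[idx]

def pv_fit_stat(resp, a, b, item, rng, n_imputations=10, n_bins=5):
    """Binned observed-vs-expected discrepancy, averaged over imputations."""
    stats = []
    for _ in range(n_imputations):
        theta = draw_plausible_values(resp, a, b, rng)
        expected_p = prob_2pl(theta, a[[item]], b[[item]])[:, 0]
        order = np.argsort(theta, kind="stable")
        chi2 = 0.0
        for group in np.array_split(order, n_bins):
            obs = resp[group, item].mean()
            exp = expected_p[group].mean()
            chi2 += len(group) * (obs - exp) ** 2 / (exp * (1 - exp))
        stats.append(chi2)
    return float(np.mean(stats))

def simulate(n, a, b, rng):
    """Generate dichotomous responses from the assumed 2PL model."""
    theta = rng.standard_normal(n)
    return (rng.random((n, len(a))) < prob_2pl(theta, a, b)).astype(float)

def bootstrap_p_value(resp, a, b, item, rng, n_boot=200):
    """Compare the observed statistic with statistics from model-simulated data."""
    observed = pv_fit_stat(resp, a, b, item, rng)
    null = np.array([
        pv_fit_stat(simulate(resp.shape[0], a, b, rng), a, b, item, rng)
        for _ in range(n_boot)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_boot + 1)
    return observed, float(p_value)
```

Because the reference distribution is built from data simulated under the fitted model, the test does not rely on an assumed chi-square distribution for the statistic, which is the motivation for bootstrapping when plausible values are drawn from unreliable short tests.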