Evaluating Testing, Profile Likelihood Confidence Interval Estimation, and Model Comparisons for Item Covariate Effects in Linear Logistic Test Models

Sun-Joo Cho; Paul De Boeck; Woo-Yeol Lee

doi:10.1177/0146621617692078

Appl Psychol Meas. 2017 Jul;41(5):353-371. doi: 10.1177/0146621617692078. Epub 2017 Feb 1.

Authors

Sun-Joo Cho¹, Paul De Boeck², Woo-Yeol Lee¹

Affiliations

¹ Vanderbilt University, Nashville, TN, USA.
² Ohio State University, Columbus, OH, USA.

Abstract

The linear logistic test model (LLTM) has been widely applied to investigate the effects of item covariates on item difficulty. The LLTM was extended with random item residuals to account for item differences not explained by the item covariates; this extended model is called the LLTM-R. In this article, statistical inference methods are investigated for these two models, and their Type I error rates and power are compared via Monte Carlo studies. Based on the simulation results, the likelihood ratio test (LRT) is recommended over the paired-sample t test based on sum scores, the Wald z test, and information criteria; it is also recommended over the profile likelihood confidence interval because of its simplicity. In addition, it is concluded that the LLTM-R is the better general modeling approach. Inferences based on the LLTM when the LLTM-R is the true model appear to be substantially biased in the liberal direction, whereas inferences based on the LLTM-R when the LLTM is the true model are biased only slightly and in the conservative direction. Furthermore, in the absence of residual variance, Type I error rates and power were acceptable, except for power when both the number of items (10 items) and the number of persons (200 persons) are small. In the presence of residual variance, however, the number of items needs to be large (80 items) to avoid an inflated Type I error rate and to reach a power level of .90 for a moderate effect.
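As a rough illustration of the LRT recommended above, the following minimal sketch tests a single item covariate effect by comparing a model that fixes the effect at zero with one that estimates it. It is not the authors' implementation: the function name `lrt_one_df` and the log-likelihood values are hypothetical, and the sketch assumes both models were fit by maximum likelihood to the same data, so that the statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one constrained parameter (df = 1).

    loglik_reduced: maximized log-likelihood with the covariate effect fixed at 0.
    loglik_full:    maximized log-likelihood with the covariate effect estimated.
    Returns the LRT statistic and its p-value.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    # Guard against tiny negative values caused by optimizer tolerance.
    stat = max(stat, 0.0)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only (not from the article).
stat, p_value = lrt_one_df(loglik_reduced=-5231.4, loglik_full=-5228.1)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.4f}")
```

The closed form via `math.erfc` holds only for 1 degree of freedom; testing several covariate effects at once would need a general chi-square survival function. The sketch also does not cover the comparison of the LLTM with the LLTM-R, where the residual variance is tested on the boundary of its parameter space and the plain chi-square reference distribution no longer applies.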

Keywords: linear logistic test model; model comparison approach; profile likelihood confidence interval; random item residuals; statistical testing.