Comparative Study
PLoS One. 2020 Jan 10;15(1):e0227196.
doi: 10.1371/journal.pone.0227196. eCollection 2020.

Adapting Cognitive Diagnosis Computerized Adaptive Testing Item Selection Rules to Traditional Item Response Theory

Free PMC article
Miguel A Sorrel et al. PLoS One.

Abstract

Currently, there are two predominant approaches in adaptive testing. One, referred to as cognitive diagnosis computerized adaptive testing (CD-CAT), is based on cognitive diagnosis models; the other, traditional CAT, is based on item response theory. The present study evaluates the performance of two item selection rules (ISRs) originally developed in the CD-CAT framework, the double Kullback-Leibler information (DKL) and the generalized deterministic inputs, noisy "and" gate model discrimination index (GDI), in the context of traditional CAT. The accuracy and test security associated with these two ISRs are compared to those of the point Fisher information and weighted KL using a simulation study. The impact of the trait level estimation method is also investigated. The results show that the new ISRs, particularly DKL, could be used to improve the accuracy of CAT. The better accuracy of DKL is achieved at the expense of a higher item overlap rate. Differences among the item selection rules become smaller as the test gets longer. The two CD-CAT ISRs select different types of items: DKL selects items with the highest possible a parameter, and GDI selects items with the lowest possible c parameter. Regarding the trait level estimator, the expected a posteriori method is generally better in the first stages of the CAT and converges with the maximum likelihood method once a medium to large number of items has been administered. The use of DKL can be recommended in low-stakes settings where test security is less of a concern.
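
As a rough illustration of the baseline rule named in the abstract, the sketch below computes point Fisher information under a three-parameter logistic (3PL) model and picks the most informative item at a provisional trait estimate. It is not the authors' code: the function names `p3pl`, `fisher_info`, and `select_item` are hypothetical, the logistic metric without the 1.7 scaling constant is an assumption, and the two item parameter sets are borrowed from the fictitious items described in the figure captions.

```python
import math

def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model
    (assumed logistic metric, no 1.7 scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    """Standard 3PL Fisher information at a single trait value theta."""
    p = p3pl(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

def select_item(theta_hat, items):
    """Return the name of the item with maximum information at theta_hat."""
    return max(items, key=lambda name: fisher_info(theta_hat, *items[name]))

# Fictitious items from the figure captions: (a, b, c).
items = {
    "A": (1.0, 0.0, 0.0),
    "B": (2.0, 0.0, 0.30),
}

for theta_hat in (-3.0, 0.0):
    chosen = select_item(theta_hat, items)
    print(theta_hat, chosen, round(fisher_info(theta_hat, *items[chosen]), 4))
```

Because the rule evaluates information only at the current point estimate, its choice depends heavily on how good that estimate is early in the test, which is one motivation for rules that weight over a range of trait values.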

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. DKL for two fictitious items.
Both items share the same b parameter (bA = bB = 0), but differ in their discrimination (aA = 1; aB = 2) and pseudo-guessing (cA = 0; cB = 0.30) parameters. Two item positions are illustrated: Item position 3 and 21. At item position 3, the examinee has one correct and one incorrect response. At item position 21, the examinee has ten correct and ten incorrect responses. The color gradient represents the weight to be applied to the KL and DKL computation based on the examinee’s likelihood function. θ is represented as z.
Fig 2. GDI for two fictitious items.
Both items share the same b parameter (bA = bB = 0), but differ in their discrimination (aA = 1; aB = 2) and pseudo-guessing (cA = 0; cB = 0.30) parameters. Two item positions are illustrated: Item position 3 and 21. At item position 3, the examinee has one correct and one incorrect response. At item position 21, the examinee has ten correct and ten incorrect responses. The color gradient represents the weight to be applied to the GDI computation based on the examinee’s likelihood function. θ is represented as z.
Fig 3. RMSE and overlap rate according to item selection rule, item position, and trait level estimator.
Fig 4. Mean a and c parameters of the administered items according to item selection rule, item position, and trait level estimator.
Fig 5. Correlation between the item exposure rates of the different item selection rules according to item position and trait level estimator.
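
The captions for Fig 1 and Fig 2 describe weighting an information measure by the examinee's likelihood function. A minimal sketch of that general idea, using a likelihood-weighted Kullback-Leibler index on a grid of trait values, might look as follows. This is a generic construction and not the paper's DKL or GDI formula, which are not given here. The helper names, the grid from -4 to 4, the provisional estimate of 0, and the two previously administered items (both with a = 1, b = 0, c = 0) are all assumptions made for illustration; only the two candidate items and the "one correct, one incorrect" response pattern come from the captions.

```python
import math

def p3pl(theta, a, b, c):
    """3PL probability of a correct response (assumed logistic metric)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def kl_item(theta_hat, theta, a, b, c):
    """KL divergence between the item's response distributions
    at theta_hat and at theta."""
    p0, p1 = p3pl(theta_hat, a, b, c), p3pl(theta, a, b, c)
    return p0 * math.log(p0 / p1) + (1.0 - p0) * math.log((1.0 - p0) / (1.0 - p1))

def likelihood_weights(grid, responses):
    """Normalized likelihood of the observed responses at each grid point.
    `responses` is a list of ((a, b, c), u) pairs with u in {0, 1}."""
    like = []
    for theta in grid:
        value = 1.0
        for params, u in responses:
            p = p3pl(theta, *params)
            value *= p if u == 1 else 1.0 - p
        like.append(value)
    total = sum(like)
    return [value / total for value in like]

def weighted_kl(theta_hat, item, grid, weights):
    """Likelihood-weighted sum of KL values over the grid."""
    return sum(w * kl_item(theta_hat, t, *item) for t, w in zip(grid, weights))

grid = [-4.0 + 0.1 * k for k in range(81)]  # assumed grid for z (theta)

# Hypothetical history at item position 3: one correct, one incorrect response.
administered = [((1.0, 0.0, 0.0), 1), ((1.0, 0.0, 0.0), 0)]
weights = likelihood_weights(grid, administered)

item_a = (1.0, 0.0, 0.0)   # fictitious item A from the captions
item_b = (2.0, 0.0, 0.30)  # fictitious item B from the captions
wkl_a = weighted_kl(0.0, item_a, grid, weights)
wkl_b = weighted_kl(0.0, item_b, grid, weights)
print(wkl_a, wkl_b)
```

With only two responses the weights are spread widely over the grid, so trait values far from the point estimate still contribute. As more responses accumulate, the likelihood concentrates and the index depends mostly on values near the estimate.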

Grant support

This research was supported by Grant PSI2017-85022-P (Ministerio de Ciencia, Innovación y Universidades, Spain) and the UAM-IIC Chair «Psychometric Models and Applications». There was no additional external funding received for this study.