Purpose: Item response theory (IRT) scoring provides T-scores for the physical and mental health subscales of the Patient-Reported Outcomes Measurement Information System Global Health questionnaire (PROMIS-GH) even when relevant items are skipped. We compared item- and score-level imputation methods for estimating T-scores with the current scoring method.
Methods: Missing PROMIS-GH items were simulated using a dataset of complete PROMIS-GH scales collected at a single tertiary care center. Four methods were used to estimate T-scores in the presence of missing item scores: (1) IRT-based scoring of available items (IRTavail), (2) item-level imputation using predictive mean matching (PMM), (3) item-level imputation using proportional odds logistic regression (POLR), and (4) T-score-level imputation (IMPdirect). Performance was assessed using the root mean squared error (RMSE) and mean absolute error (MAE) of T-scores and by comparing regression coefficients estimated under each of the four methods with those of the complete data model. Different proportions of missingness and sample sizes were examined.
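As an illustration of the two accuracy metrics named in the Methods, the following is a minimal sketch and not the authors' code. The function names and the T-score values are hypothetical and chosen only to show the arithmetic: each estimated T-score is compared with the T-score obtained from the complete data.

```python
import numpy as np


def rmse(estimated, reference):
    """Root mean squared error between estimated and reference T-scores."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimated - reference) ** 2)))


def mae(estimated, reference):
    """Mean absolute error between estimated and reference T-scores."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(estimated - reference)))


# Hypothetical values: T-scores from complete data versus T-scores
# estimated after some items were set to missing.
t_complete = np.array([50.0, 42.0, 61.0, 38.0])
t_estimated = np.array([52.0, 42.0, 58.0, 39.0])

print(rmse(t_estimated, t_complete))
print(mae(t_estimated, t_complete))
```

Because RMSE squares the errors before averaging, it penalizes occasional large discrepancies more heavily than MAE, which is one reason to report both.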
Results: IRTavail had the lowest RMSE and MAE for mental health T-scores, while PMM had the lowest RMSE and MAE for physical health T-scores. For both physical and mental health T-scores, regression coefficients estimated from the imputation methods were closer to those of the complete data model than those estimated with IRTavail.
Conclusions: The available item scoring method produced more accurate PROMIS-GH mental health T-scores but less accurate physical health T-scores than the imputation methods. Using item-level imputation strategies may result in regression coefficient estimates closer to those of the complete data model when the nonresponse rate is high. The choice of method may depend on the application, sample size, and amount of missingness.
Keywords: Item nonresponse; Item response theory; Multiple imputation; PROMIS Global Health; Simulation.