Objective: Modification of a measure can affect the comparability of scores across groups and settings; changes to items can alter the percentage of respondents endorsing a symptom.
Methods: Under item response theory (IRT) methods, well-calibrated items can be used interchangeably, and the exact same items need not be administered to every respondent, in principle permitting wider latitude for modification.
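To illustrate the interchangeability claim, the sketch below evaluates the two-parameter logistic (2PL) IRT model, under which any item with calibrated discrimination and difficulty parameters yields an endorsement probability on the same latent-trait metric. The item names and parameter values are hypothetical, chosen only for illustration; they are not from the study.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL IRT model:
    P = 1 / (1 + exp(-a * (theta - b))), where theta is the latent
    trait, a the discrimination, and b the difficulty parameter."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Two hypothetical items calibrated to the same latent-trait metric;
# either could be administered, since both map theta to a probability
# on a common scale.
items = {"item_A": (1.2, -0.5), "item_B": (0.8, 0.3)}

theta = 0.0  # a respondent at the mean of the trait distribution
for name, (a, b) in items.items():
    print(name, round(p_2pl(theta, a, b), 3))
```

Because both items are expressed on the same theta metric, responses to either contribute to a comparable trait estimate, which is what permits item substitution.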
Results: Recommendations regarding modifications vary with the intended use of the measure. In research contexts, adjustments can be made at the analytic level by freeing and fixing parameters based on findings of differential item functioning (DIF). The consequences of DIF for clinical decision making depend on whether the patient's performance level approaches the scale's decision cutpoint. High-stakes testing may require item removal or separate calibrations to ensure accurate assessment.
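A minimal sketch of the "freeing parameters" adjustment, under assumed (hypothetical) 2PL parameter values: when an item shows DIF, its difficulty can be freed to take a group-specific value rather than a single pooled value, and the resulting gap in endorsement probability is largest for respondents near a decision cutpoint.

```python
import math

def p_2pl(theta, a, b):
    """2PL IRT endorsement probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical DIF item: harder for the focal group than the reference group.
a = 1.0
b_reference, b_focal = 0.0, 0.5  # group-specific (freed) difficulty values

theta = 0.4  # respondent near an assumed decision cutpoint

# Constrained model: one pooled difficulty shared by both groups.
p_constrained = p_2pl(theta, a, (b_reference + b_focal) / 2)
# Freed model: the focal group's own difficulty is used.
p_focal_freed = p_2pl(theta, a, b_focal)

# The difference is the score distortion a focal-group respondent would
# incur if the DIF were ignored at the analytic level.
print(round(p_constrained - p_focal_freed, 3))
```

For respondents far above or below the cutpoint, both models assign probabilities near 1 or 0, so the same DIF has little practical consequence, consistent with the point above about performance level relative to the cutpoint.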
Discussion: Guidelines for modification based on DIF analyses are presented, together with illustrations of the impact of such adjustments.