A comparison of computer adaptive tests (CATs) and short forms in terms of accuracy and number of items administrated using PROMIS profile

Qual Life Res. 2020 Jan;29(1):213-221. doi: 10.1007/s11136-019-02312-8. Epub 2019 Oct 8.


Purpose: In the Patient-Reported Outcomes Measurement Information System (PROMIS), seven domains (Physical Function, Anxiety, Depression, Fatigue, Sleep Disturbance, Social Function, and Pain Interference) are packaged together as profiles. Each of these domains can also be assessed using computer adaptive tests (CATs) or short forms (SFs) of varying length (e.g., 4, 6, and 8 items). We compared the accuracy and the number of items administered for CAT versus each SF.

Methods: PROMIS instruments are scored using item response theory (IRT) with the graded response model and reported as T scores (mean = 50, SD = 10). We simulated 10,000 subjects from a normal distribution with mean 60 for symptom scales and 40 for function scales, and standard deviation 10 in each domain. We considered a subject's score to be accurate when the standard error (SE) was less than 3.0. We recorded the range of accurate scores (accurate range) and the number of items administered.
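
The accuracy criterion described in the Methods can be illustrated with a minimal sketch. It is not the authors' simulation code. It assumes hypothetical item parameters (not PROMIS calibrations), approximates the SE on the T-score metric as 10 divided by the square root of the test information at the true score (actual PROMIS scoring uses estimated scores, so values will differ), and uses illustrative function names (grm_item_information, short_form_se_t, accurate_range).

```python
import numpy as np

# Hypothetical 4-item short form: (discrimination a, ordered thresholds b).
# These values are made up for illustration; they are NOT PROMIS calibrations.
HYPOTHETICAL_ITEMS = [
    (3.0, [-0.5, 0.3, 1.0, 1.8]),
    (2.8, [-0.2, 0.5, 1.2, 2.0]),
    (3.2, [-0.8, 0.0, 0.8, 1.5]),
    (2.5, [0.0, 0.7, 1.4, 2.2]),
]


def grm_item_information(theta, a, b):
    """Fisher information of one graded response model item at each theta."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    # Cumulative probabilities P*(X >= k), padded with 1 (lowest) and 0 (highest).
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    p_star = np.hstack([np.ones((theta.size, 1)), p_star, np.zeros((theta.size, 1))])
    # Category probabilities and their derivatives with respect to theta.
    p_cat = np.maximum(p_star[:, :-1] - p_star[:, 1:], 1e-12)  # guard against 0
    w = p_star * (1.0 - p_star)
    d_cat = a * (w[:, :-1] - w[:, 1:])
    return np.sum(d_cat ** 2 / p_cat, axis=1)


def short_form_se_t(theta, items):
    """Approximate SE on the T-score metric: 10 / sqrt(test information)."""
    total_info = sum(grm_item_information(theta, a, b) for a, b in items)
    return 10.0 / np.sqrt(total_info)


def accurate_range(t_scores, se, threshold=3.0):
    """Return (min, max) T score among subjects with SE < threshold, or None."""
    t_scores = np.asarray(t_scores, dtype=float)
    mask = np.asarray(se, dtype=float) < threshold
    if not mask.any():
        return None
    return float(t_scores[mask].min()), float(t_scores[mask].max())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Symptom-scale population from the Methods: mean 60, SD 10, 10,000 subjects.
    t_true = rng.normal(60.0, 10.0, 10_000)
    theta = (t_true - 50.0) / 10.0  # T score back to the IRT theta metric
    se = short_form_se_t(theta, HYPOTHETICAL_ITEMS)
    print("accurate range:", accurate_range(t_true, se))
    print("share of subjects with SE < 3.0:", float(np.mean(se < 3.0)))
```

Under this approximation, SE < 3.0 on the T-score metric corresponds to test information above (10/3)^2, roughly 11.1, on the theta metric. A CAT would replace the fixed item list with items selected per subject, which this sketch does not attempt.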

Results: The average number of items administered in CAT was 4.7 across all domains. The accurate range was wider for CAT than for any SF in each domain. CAT was notably better at extending the accurate range into very poor health for Fatigue, Physical Function, and Pain Interference. Most SFs provided a reasonably wide accurate range.

Conclusions: Relative to SFs, CATs provided the widest accurate range, with slightly more items than SF4 and fewer than SF6 and SF8. Most SFs, especially longer ones, provided a reasonably wide accurate range.

Keywords: Computer adaptive testing (CAT); Item response theory; PROMIS; Short form.

MeSH terms

  • Adult
  • Computer-Aided Design / standards*
  • Female
  • Humans
  • Male
  • Quality of Life / psychology*