Assessment in the context of uncertainty: how many members are needed on the panel of reference of a script concordance test?

Med Educ. 2005 Mar;39(3):284-91. doi: 10.1111/j.1365-2929.2005.02092.x.


Purpose: The script concordance test (SCT) assesses clinical reasoning in the context of uncertainty. Because there is no single correct answer, scoring is based on a comparison of answers provided by examinees with those provided by members of a panel of reference made up of experienced practitioners. This study aims to determine how many members are needed on the panel to obtain reliable scores to compare against the scores of examinees.

Methods: A group of 80 residents was tested on 73 items (Cronbach's alpha: 0.76). A total of 38 family doctors made up the pool of experienced practitioners, from which 1000 random panels of reference of increasing sizes (5, 10, 15, 20, 25 and 30) were generated with a resampling procedure. Residents' scores were computed for each panel sample. The units of analysis were the mean of the residents' scores, the test reliability coefficient and the correlation coefficient between the scores obtained with a given panel of reference and the scores obtained with the full panel of 38. Statistics were averaged across the 1000 samples for each panel size for the mean and test reliability computations, and across 100 samples for the correlation computation.
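The resampling procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the aggregate scoring rule commonly used for SCTs (credit for an answer equals the number of panelists who chose it divided by the count of the modal answer), sampling of panel members without replacement, and answers on a 5-point scale; none of these details are stated in the abstract. All function names and the synthetic data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def scoring_key(panel):
    """Build one credit table per item from the panel's answers.

    ASSUMPTION: aggregate scoring, where an answer earns
    (number of panelists who chose it) / (count of the modal answer).
    """
    key = []
    for item in range(len(panel[0])):
        counts = Counter(member[item] for member in panel)
        modal = max(counts.values())
        key.append({answer: n / modal for answer, n in counts.items()})
    return key


def score(examinee, key):
    """Sum the credits earned; answers no panelist chose earn 0."""
    return sum(credits.get(answer, 0.0) for answer, credits in zip(examinee, key))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def resample_panels(pool, examinees, size, n_samples, rng):
    """Return (SD of mean scores, mean correlation with full-pool scores)."""
    full_key = scoring_key(pool)
    full_scores = [score(e, full_key) for e in examinees]
    means, corrs = [], []
    for _ in range(n_samples):
        # ASSUMPTION: panel members are drawn without replacement.
        panel = rng.sample(pool, size)
        key = scoring_key(panel)
        scores = [score(e, key) for e in examinees]
        means.append(mean(scores))
        corrs.append(pearson(scores, full_scores))
    return pstdev(means), mean(corrs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic answers on a hypothetical 5-point scale (-2..2), not study data.
    pool = [[rng.randint(-2, 2) for _ in range(73)] for _ in range(38)]
    residents = [[rng.randint(-2, 2) for _ in range(73)] for _ in range(80)]
    for size in (5, 10, 15, 20, 25, 30):
        sd, r = resample_panels(pool, residents, size, 20, rng)
        print(size, round(sd, 2), round(r, 2))
```

Because the demonstration data are random, the printed values will not resemble those reported in the study; the sketch only shows how the statistics would be accumulated per panel size.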

Results: For sample variability, the standard deviation of the means was about 3 times larger with a panel size of 5 (SD=1.57) than with a panel size of 30 (SD=0.50). For reliability, the coefficient rose markedly from a panel size of 5 (0.62) to a panel size of 10 (0.70). When the panel size was over 20, the gain became negligible (0.74 for 20 and 0.76 for 38). For correlation, the mean correlation coefficients were 0.90 with 5 panel members, 0.95 with 10 members and 0.98 with 20 members.
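For readers who want to reproduce a reliability coefficient of this kind, the following is a minimal sketch of Cronbach's alpha computed from a matrix of item scores. It assumes that the reliability coefficient reported here is Cronbach's alpha, which the abstract names only for the full test; the function name and input layout are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows = examinees, columns = item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of totals)
    Population variance is used throughout; the ratio is the same
    with sample variance as long as both terms use the same formula.
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

In a resampling design, this function would be applied to the residents' item-level credits under each sampled panel, and the resulting coefficients averaged per panel size.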

Conclusion: Any panel size over 10 is associated with acceptable reliability and good correlation between scores from the sampled panels and scores from the full panel of 38. For high-stakes examinations, a panel of 20 members is recommended. Recruiting more than 20 panel members brings only a marginal benefit in terms of psychometric properties.

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Clinical Competence / standards*
  • Education, Medical, Undergraduate*
  • France
  • Humans
  • Middle Aged
  • Observer Variation
  • Quebec
  • Reproducibility of Results