Context: Cut-scores, reliability and validity vary among standard-setting methods. The modified Angoff method (MA) is a well-known standard-setting procedure, but the three-level Angoff approach (TLA), a recent modification, has not been extensively evaluated.
Objectives: This study aimed to compare standards and pass rates in an objective structured clinical examination (OSCE) obtained using two methods of standard setting with discussion and reality checking, and to assess the reliability and validity of each method.
Methods: A sample of 105 medical students participated in a 14-station OSCE. Fourteen faculty members took part in the MA procedure and 10 in the TLA procedure. In the MA, judges estimated the probability that a borderline student would pass each station. In the TLA, judges estimated whether a borderline examinee would perform the task correctly or not. Having given individual ratings, judges discussed their decisions. One week after the examination, the procedure was repeated using normative data (reality checking).
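As an illustration of the MA step described above, the following minimal sketch shows how a cut-score and the resulting pass rate could be derived from judges' ratings. It assumes the cut-score is the mean of the per-station averages of the judges' probability estimates, and that a student passes when the total score is at or above the cut-score; the abstract does not state either rule explicitly. The function names and all numbers in the usage lines are hypothetical and are not data from this study.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a cut-score in percent from judges' ratings.

    ratings[j][s] is judge j's estimated probability (0 to 1) that a
    borderline student passes station s. Assumption: the cut-score is
    the mean over stations of the mean rating across judges.
    """
    station_means = [mean(column) for column in zip(*ratings)]
    return 100 * mean(station_means)


def pass_rate(total_scores, cut_score):
    """Return the percentage of students scoring at or above the cut-score.

    Assumption: a score equal to the cut-score counts as a pass.
    """
    passed = sum(score >= cut_score for score in total_scores)
    return 100 * passed / len(total_scores)


# Hypothetical ratings: 3 judges (rows) by 3 stations (columns).
example_ratings = [
    [0.50, 0.60, 0.40],
    [0.55, 0.50, 0.45],
    [0.45, 0.55, 0.50],
]
# Hypothetical total test scores in percent.
example_scores = [40.0, 48.0, 55.0, 62.0]

cut = angoff_cut_score(example_ratings)
print(f"cut-score: {cut:.2f}%")
print(f"pass rate: {pass_rate(example_scores, cut):.1f}%")
```

Rerunning the same calculation on ratings revised after discussion, and again after reality checking, would give the successive cut-scores and pass rates that the study compares.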
Results: The mean score for the total test was 54.11% (standard deviation: 8.80%). The MA cut-scores for the total test were 49.66% and 51.52% after discussion and reality checking, respectively (the consequent percentages of passing students were 65.7% and 58.1%, respectively). The TLA yielded mean pass scores of 53.92% and 63.09% after discussion and reality checking, respectively (rates of passing candidates were 44.8% and 12.4%, respectively). Compared with the TLA, the MA showed higher agreement between judges (0.94 versus 0.81) and a narrower 95% confidence interval in standards (3.22 versus 11.29).
Conclusions: The MA seems a more credible and reliable procedure with which to set standards for an OSCE than does the TLA, especially when a reality check is applied.
© Blackwell Publishing Ltd 2011.