Although checklists are often used to score standardized patient-based clinical assessments, little research has addressed how they are developed or how well raters agree on the importance of specific items. Five physicians independently reviewed checklists from 11 simulation scenarios that were part of the former Educational Commission for Foreign Medical Graduates' Clinical Skills Assessment and classified the clinical appropriateness of each checklist item. Approximately 78% of the original checklist items were judged to be needed, or indicated, given the presenting complaint and the purpose of the assessment. Rater agreement was relatively poor, with pairwise kappa coefficients ranging from 0.09 to 0.29. However, when only consensus-indicated items were included, there was little change in examinee scores or in their reliability over encounters. Although most checklist items in this sample were judged to be appropriate, some could potentially be eliminated, thereby reducing the scoring burden placed on the standardized patients. Periodic review of checklist items, concentrating on their clinical importance, is warranted.