Purpose: To summarize the tool characteristics, sources of validity evidence, methodological quality, and reporting quality for studies of technology-enhanced simulation-based assessments for health professions learners.
Method: The authors conducted a systematic review, searching MEDLINE, EMBASE, CINAHL, ERIC, PsychINFO, Scopus, key journals, and previous reviews through May 2011. They selected original research in any language evaluating simulation-based assessment of practicing and student physicians, nurses, and other health professionals. Reviewers working in duplicate evaluated validity evidence using Messick's five-source framework; methodological quality using the Medical Education Research Study Quality Instrument and the revised Quality Assessment of Diagnostic Accuracy Studies; and reporting quality using the Standards for Reporting Diagnostic Accuracy and Guidelines for Reporting Reliability and Agreement Studies.
Results: Of 417 studies, 350 (84%) involved physicians at some stage in training. Most focused on procedural skills, including minimally invasive surgery (N=142), open surgery (81), and endoscopy (67). Common elements of validity evidence included relations with trainee experience (N=306), content (142), relations with other measures (128), and interrater reliability (124). Of the 217 studies reporting more than one element of evidence, most were judged as having high or unclear risk of bias due to selective sampling (N=192) or test procedures (132). Only 64% proposed a plan for interpreting the evidence to be presented (validity argument).
Conclusions: Validity evidence for simulation-based assessments is sparse and is concentrated within specific specialties, tools, and sources of validity evidence. The methodological and reporting quality of assessment studies leaves much room for improvement.