Using Automated Scoring to Evaluate Written Responses in English and French on a High-Stakes Clinical Competency Examination

Eval Health Prof. 2016 Mar;39(1):100-13. doi: 10.1177/0163278715605358. Epub 2015 Sep 16.


We present a framework for technology-enhanced scoring of bilingual clinical decision-making (CDM) questions using an open-source scoring technology and evaluate the strength of the proposed framework using operational data from the Medical Council of Canada Qualifying Examination. Candidates' responses to six write-in CDM questions were used to develop a three-stage automated scoring framework. In Stage 1, linguistic features were extracted from the CDM responses. In Stage 2, supervised machine learning techniques were employed to develop the scoring models. In Stage 3, responses to six English and French CDM questions were scored using the models from Stage 2. Of the 8,007 English and French CDM responses, 7,643 were accurately scored, an agreement rate of 95.4% between human and computer scoring. This represents a 5.4% improvement over human inter-rater reliability. Our framework yielded scores similar to those of expert physician markers and could be used for clinical competency assessment.
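The three-stage pipeline described in the abstract (feature extraction, supervised model training, automated scoring) can be illustrated with a minimal sketch. All data, feature choices, and the scoring model below are invented for demonstration; the study's actual open-source scoring technology and linguistic features are not specified here.

```python
from collections import Counter

# Hypothetical sketch of a three-stage automated scoring pipeline.
# The toy responses, the bag-of-words features, and the overlap-based
# scoring model are illustrative assumptions, not the study's method.

def extract_features(response):
    """Stage 1: extract simple lexical features (bag of words)."""
    return Counter(response.lower().split())

def train_model(responses, human_scores):
    """Stage 2: learn a word-frequency profile per human-assigned score."""
    model = {}
    for text, score in zip(responses, human_scores):
        model.setdefault(score, Counter()).update(extract_features(text))
    return model

def score_response(model, response):
    """Stage 3: assign the score whose word profile best overlaps the response."""
    feats = extract_features(response)
    def overlap(score):
        profile = model[score]
        return sum(min(feats[w], profile[w]) for w in feats)
    return max(model, key=overlap)

# Toy training data (invented): 1 = acceptable answer, 0 = unacceptable
train = [("order a chest x ray", 1),
         ("obtain chest radiograph", 1),
         ("prescribe antibiotics now", 0)]
model = train_model([t for t, _ in train], [s for _, s in train])
print(score_response(model, "order chest x ray"))  # prints 1
```

In an operational setting, automated scores would be compared against human marks to compute an agreement rate, as reported in the study; a production system would use richer linguistic features and a validated supervised learner rather than this toy overlap heuristic.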

Keywords: automated essay scoring; clinical competency assessment; clinical decision making; computerized assessment and evaluation; medical licensing examination.

MeSH terms

  • Canada
  • Clinical Competence*
  • Clinical Decision-Making
  • Educational Measurement / methods*
  • Educational Measurement / standards*
  • Electronic Data Processing / standards*
  • Humans
  • Licensure, Medical
  • Reproducibility of Results
  • Translating*