Generalisability: a key to unlock professional assessment

Med Educ. 2002 Oct;36(10):972-8. doi: 10.1046/j.1365-2923.2002.01320.x.


Context: Reliability is the extent to which a result reflects all possible measurements of the same construct, and it is an essential measurement characteristic. Unfortunately, few objective tests exist for the most important aspects of the professional role because those aspects are complex and intangible. In addition, professional performance varies markedly from setting to setting and from case to case. Both factors threaten reliability.

Aim: This paper describes the classical approach to evaluating reliability and points out its limitations. It then describes how generalisability theory overcomes many of them.

Conditions: A G-study uses variance component analysis to measure the contribution that each relevant factor (observer, situation, case, assessee and their interactions) makes to the result. These components can be combined to show how well a single observation reflects all possible measurements, which is a true index of reliability. They can also be used to estimate the reliability of a combined sample of several different observations, or to predict how many observations are required with different test formats to achieve a given level of reliability. Worked examples are used to illustrate the concepts.
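
The G-study and follow-up calculations described above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's own worked example. It assumes the simplest possible design: every assessee is scored once by every observer, so only assessee, observer and residual variance can be separated. The score matrix is invented for illustration, and the function names (variance_components, g_coefficient, observations_needed) are hypothetical. The variance components come from the standard expected-mean-square solutions for a two-way crossed ANOVA without replication.

```python
import math


def variance_components(scores):
    """Estimate variance components for a fully crossed assessee x observer design.

    scores[i][j] is the score given to assessee i by observer j (one per cell).
    """
    n_p = len(scores)        # number of assessees
    n_r = len(scores[0])     # number of observers
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[i][j] for i in range(n_p)) / n_p for j in range(n_r)]

    # Sums of squares for the two main effects and the residual.
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions; negative estimates are truncated at zero.
    return {
        "assessee": max((ms_p - ms_res) / n_r, 0.0),
        "observer": max((ms_r - ms_res) / n_p, 0.0),
        "residual": max(ms_res, 0.0),  # interaction confounded with error
    }


def g_coefficient(vc, n_obs, absolute=False):
    """Reliability of the mean of n_obs observations.

    absolute=False: only rank-ordering error counts (residual).
    absolute=True: observer stringency also counts as error.
    """
    error = vc["residual"] + (vc["observer"] if absolute else 0.0)
    return vc["assessee"] / (vc["assessee"] + error / n_obs)


def observations_needed(vc, target, absolute=False):
    """Smallest number of observations whose mean reaches the target reliability."""
    error = vc["residual"] + (vc["observer"] if absolute else 0.0)
    return max(1, math.ceil(target / (1 - target) * error / vc["assessee"]))


# Invented scores: 5 assessees (rows) each rated by the same 3 observers (columns).
scores = [
    [7, 6, 8],
    [5, 4, 6],
    [9, 7, 8],
    [4, 5, 6],
    [6, 5, 7],
]

vc = variance_components(scores)
print("variance components:", vc)
print("single-observation reliability:", g_coefficient(vc, 1))
print("observations needed for 0.9:", observations_needed(vc, 0.9))
```

A design with more facets (situation, case) follows the same logic with more components. In practice these are usually estimated with a mixed-model or variance-component routine from a statistics package rather than by hand.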

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Analysis of Variance
  • Clinical Competence / standards*
  • Education, Medical, Undergraduate / standards*
  • Humans
  • Observer Variation
  • Reproducibility of Results