Objective: The International Classification of Functioning, Disability and Health (ICF) is increasingly used to describe and classify functioning in medicine, although it is not a psychometrically sound measure. All categories of the ICF are quantified using the same generic 0-4 scale. The objective of this study was to assess observer agreement when describing and classifying functioning with the ICF.
Design: A second-level category of the ICF, d430 lifting and carrying objects, was used as an example. Clinically meaningful definitions were assigned to the qualifiers of this category. Data were collected in a cross-sectional survey with repeated measurement. We report raw, specific and chance-corrected measures of agreement, a graphical method and the results of log-linear models for ordinal agreement.
Subjects/patients: A convenience sample of patients requiring physical therapy in an acute hospital.
Results: Twenty-five patients were assessed twice by 2 observers. Raw agreement was 0.52. Kappa was 0.36, indicating fair agreement. Different hierarchical log-linear models indicated that the strength of agreement was not homogeneous over all categories.
Conclusion: Observer agreement has to be evaluated when describing and classifying functioning using the ICF qualifiers' scale. When assessing inter-observer reliability, the first step is to calculate a summary statistic. Modelling agreement yields valuable insight into the structure of a contingency table, which can lead to further improvement of the scale.
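
As an illustration of the summary statistics mentioned above, the following minimal sketch computes raw agreement and Cohen's unweighted kappa from a square contingency table of two observers' ratings. The function name `agreement_stats` and the 5x5 table are assumptions made for this example; the counts are invented and are not the study data, and the abstract does not state which kappa variant or software the authors used.

```python
def agreement_stats(table):
    """Return (raw agreement, Cohen's unweighted kappa) for a square table.

    table[i][j] = number of patients rated i by observer 1 and j by observer 2.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Raw (observed) agreement: proportion of ratings on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n

    # Marginal proportions for each observer.
    row_m = [sum(table[i]) / n for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    # Agreement expected by chance if the observers rated independently.
    p_e = sum(r * c for r, c in zip(row_m, col_m))
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Hypothetical counts on the generic 0-4 qualifier scale (NOT the study data).
example_table = [
    [6, 2, 0, 0, 0],
    [2, 5, 3, 0, 0],
    [0, 3, 4, 2, 0],
    [0, 0, 2, 5, 1],
    [0, 0, 0, 2, 3],
]

raw, kappa = agreement_stats(example_table)
print(f"raw agreement = {raw:.2f}, kappa = {kappa:.2f}")
```

A single summary value of this kind does not show whether agreement differs between categories, which is the question the log-linear models address.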