Pitfalls and important issues in testing reliability using intraclass correlation coefficients in orthopaedic research
- PMID: 22662301
- PMCID: PMC3360188
- DOI: 10.4055/cios.2012.4.2.149
Abstract
Background: Intraclass correlation coefficients (ICCs) provide a statistical means of testing reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use.
Methods: First, orthopedic articles that used ICCs were retrieved from the PubMed database, and journal demography, the ICC models, and the concurrent statistics used were evaluated. Second, reliability testing was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods for testing reliability. Third, the factors affecting the ICC values were examined by simulating data sets based on the physical examination data, in which the ranges, slopes, and interobserver variability were modified.
Results: Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described the model, type, and measure in full. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of the popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measure used. In simulated data sets, the ICC showed higher values when the range of the data set was larger, the slopes of the data sets were parallel, and the interobserver variability was smaller.
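To make the dependence on the ICC model concrete, the following is a minimal sketch, not the authors' analysis, of the single-measure ICCs commonly written ICC(1,1), ICC(2,1), and ICC(3,1), computed from ANOVA mean squares. The function name `icc_single_measures` and the `ratings` array are hypothetical; the angles are invented for illustration, with the third rater reading systematically higher.

```python
import numpy as np

def icc_single_measures(x):
    """Single-measure ICCs from a (subjects x raters) array.

    Returns a dict with the one-way random, two-way random
    (absolute agreement), and two-way mixed (consistency) forms.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                    # between-subjects mean square
    jms = ss_cols / (k - 1)                    # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))         # residual mean square
    wms = (ss_cols + ss_err) / (n * (k - 1))   # within-subjects mean square

    return {
        "ICC(1,1)": (bms - wms) / (bms + (k - 1) * wms),
        "ICC(2,1)": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        "ICC(3,1)": (bms - ems) / (bms + (k - 1) * ems),
    }

# Hypothetical angles in degrees: 5 patients, 3 raters.
# The third rater reads consistently higher than the other two.
ratings = np.array([
    [30, 32, 36],
    [45, 44, 50],
    [20, 23, 26],
    [55, 53, 59],
    [38, 40, 44],
])

for name, value in icc_single_measures(ratings).items():
    print(f"{name}: {value:.3f}")
```

Because ICC(3,1) ignores systematic differences between raters while ICC(2,1) penalizes them, the same ratings yield different coefficients depending on the model chosen, which is why a reported ICC is hard to interpret when the model is not stated.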
Conclusions: Care should be taken when interpreting absolute ICC values; a higher ICC does not necessarily mean less variability, because ICC values can also be affected by various other factors. The authors recommend that researchers clarify the ICC models used and interpret ICC values in the context of measurement.
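The point that a higher ICC need not mean less variability can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the study's simulated data sets: the helper names (`icc_2_1`, `mean_abs_diff`, `simulate`), the normal distributions, the mean of 40 degrees, and the standard deviations are all invented for illustration. Both data sets share the same random seed, so the rater noise is identical and only the spread of the true values differs.

```python
import numpy as np

def icc_2_1(x):
    """Two-way random, absolute-agreement, single-measure ICC."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_err / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

def mean_abs_diff(x):
    """Mean absolute difference averaged over all rater pairs."""
    k = x.shape[1]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return float(np.mean([np.abs(x[:, i] - x[:, j]).mean() for i, j in pairs]))

def simulate(true_sd, noise_sd, n=30, k=3, seed=0):
    """Simulated ratings: a true value per subject plus independent rater noise."""
    rng = np.random.default_rng(seed)
    truth = 40.0 + true_sd * rng.standard_normal((n, 1))  # assumed mean of 40 degrees
    noise = noise_sd * rng.standard_normal((n, k))
    return truth + noise

# Same rater noise in both cases; only the range of true values changes.
narrow = simulate(true_sd=3.0, noise_sd=4.0)
wide = simulate(true_sd=15.0, noise_sd=4.0)

for label, data in (("narrow range", narrow), ("wide range", wide)):
    print(f"{label}: ICC(2,1) = {icc_2_1(data):.2f}, "
          f"mean abs diff = {mean_abs_diff(data):.2f}")
```

In this setup the raters disagree by the same absolute amount in both data sets, yet the ICC is higher for the wide-range data because between-subject variance makes up a larger share of the total. Reporting an absolute measure such as the mean absolute difference alongside the ICC keeps the coefficient tied to the measurement context.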
Keywords: Intraclass correlation coefficient; Orthopaedic research; Reliability.
Conflict of interest statement
No potential conflict of interest relevant to this article was reported.
