Pitfalls and important issues in testing reliability using intraclass correlation coefficients in orthopaedic research

Clin Orthop Surg. 2012 Jun;4(2):149-55. doi: 10.4055/cios.2012.4.2.149. Epub 2012 May 17.


Background: Intraclass correlation coefficients (ICCs) provide a statistical means of testing reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use.

Methods: First, orthopedic articles that used ICCs were retrieved from the PubMed database, and the journal demography, ICC models, and concurrent statistics used were evaluated. Second, a reliability test was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods for testing reliability. Third, the factors affecting the ICC values were examined by simulating data sets based on the physical examination data, in which the ranges, slopes, and interobserver variability were modified.

Results: Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described the model, type, and measure in full. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, its ICC was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measure used. In simulated data sets, the ICC was higher when the range of the data was larger, the slopes of the data sets were parallel, and the interobserver variability was smaller.
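The range effect described above can be illustrated with a minimal sketch. It assumes a two-way random-effects, absolute-agreement, single-measure ICC (ICC(2,1) in Shrout and Fleiss notation), which the abstract does not name as the form used. The data are synthetic and are not the study's measurements; the helper names `icc_2_1`, `mean_abs_diff`, and `simulate`, and the spread and noise values, are hypothetical choices made only for illustration.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) from two-way ANOVA mean squares (assumed form, see lead-in).

    ratings: array of shape (n_subjects, k_raters).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # one mean per subject
    col_means = x.mean(axis=0)  # one mean per rater
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def mean_abs_diff(ratings):
    """Mean absolute difference over all rater pairs and subjects."""
    x = np.asarray(ratings, dtype=float)
    k = x.shape[1]
    diffs = [np.abs(x[:, i] - x[:, j])
             for i in range(k) for j in range(i + 1, k)]
    return float(np.mean(diffs))


def simulate(spread, noise_sd, n=30, k=3, seed=0):
    """Synthetic ratings: true values uniform on [0, spread] plus rater noise."""
    rng = np.random.default_rng(seed)
    true = rng.uniform(0, spread, size=n)
    return true[:, None] + rng.normal(0, noise_sd, size=(n, k))


# Hypothetical settings: a narrow-range test with small rater noise and a
# wide-range test with larger rater noise (30 subjects, 3 raters each).
narrow = simulate(spread=10, noise_sd=3)
wide = simulate(spread=60, noise_sd=5)

for name, data in [("narrow range", narrow), ("wide range", wide)]:
    print(name, "ICC(2,1):", round(icc_2_1(data), 3),
          "mean abs diff:", round(mean_abs_diff(data), 2))
```

In this sketch the wide-range data set has the larger mean absolute difference between raters and also the higher ICC, because the ICC compares rater disagreement with the spread between subjects rather than reporting disagreement in the units of measurement.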

Conclusions: Care should be taken when interpreting absolute ICC values: a higher ICC does not necessarily mean less variability, because ICC values can also be affected by various factors. The authors recommend that researchers clarify the ICC models used and interpret ICC values in the context of measurement.
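Why the reported model matters can be shown with a small sketch that computes several ICC forms from the same ratings. The Shrout and Fleiss labels ICC(2,1), ICC(3,1), ICC(2,k), and ICC(3,k) are used here as an assumption; the abstract does not list which forms were compared. The data are synthetic, and the function name `icc_variants` and the 10-degree rater offset are hypothetical.

```python
import numpy as np


def icc_variants(ratings):
    """Return four ICC forms computed from two-way ANOVA mean squares.

    ratings: array of shape (n_subjects, k_raters).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))        # residual
    return {
        # absolute agreement, single rater
        "ICC(2,1)": (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
        # consistency, single rater
        "ICC(3,1)": (msr - mse) / (msr + (k - 1) * mse),
        # absolute agreement, mean of k raters
        "ICC(2,k)": (msr - mse) / (msr + (msc - mse) / n),
        # consistency, mean of k raters
        "ICC(3,k)": (msr - mse) / msr,
    }


# Synthetic angles for 30 subjects and 3 raters (hypothetical values).
rng = np.random.default_rng(1)
true = rng.uniform(20, 60, size=30)
base = true[:, None] + rng.normal(0, 4, size=(30, 3))

# The third rater reads 10 degrees higher than the others on every subject.
biased = base + np.array([0.0, 0.0, 10.0])

for name, data in [("no offset", base), ("rater 3 offset", biased)]:
    values = icc_variants(data)
    print(name, {key: round(val, 3) for key, val in values.items()})
```

A constant offset in one rater leaves the consistency forms unchanged but lowers the absolute-agreement forms, and average-measure forms exceed their single-measure counterparts, so an ICC reported without its model, type, and measure cannot be compared across studies.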

Keywords: Intraclass correlation coefficient; Orthopaedic research; Reliability.

MeSH terms

  • Adolescent
  • Biomedical Research / methods*
  • Biomedical Research / standards*
  • Cerebral Palsy
  • Child
  • Child, Preschool
  • Computer Simulation
  • Databases, Factual
  • Female
  • Humans
  • Male
  • Models, Theoretical
  • Orthopedics / methods*
  • Orthopedics / standards*
  • Physical Examination
  • Range of Motion, Articular
  • Reproducibility of Results
  • Research Design
  • Statistics as Topic
  • Young Adult