The interrater reliability of 4 clinical tests used to assess individuals with musculoskeletal hip pain

J Orthop Sports Phys Ther. 2008 Feb;38(2):71-7. doi: 10.2519/jospt.2008.2677. Epub 2007 Sep 21.


Study design: Descriptive and reliability study.

Objectives: To evaluate the interrater reliability of the FABER test, flexion-internal rotation-adduction impingement test, log roll test, and the palpation of the greater trochanter for tenderness.

Background: Clinical examination for individuals with musculoskeletal hip pain is believed to provide critical diagnostic information. However, there is very limited information in the literature on the reproducibility of examination techniques for the hip region.

Methods and measures: Seventy subjects were evaluated prospectively by an orthopaedic surgeon and physical therapist. Subjects had a mean age of 42 years (range 18-76 years; SD 15.4) and included 32 (46%) females and 38 (54%) males. Subject diagnoses were as follows: degenerative joint disease (n=27 [39% of subjects]), labral tear (n=35 [50% of subjects]), femoroacetabular impingement (n=48 [69% of subjects]), capsular laxity (n=28 [40% of subjects]), trochanteric bursitis (n=29 [41% of subjects]), iliopsoas tendonitis (n=10 [14% of subjects]), and adductor strain (n=2 [3% of subjects]). Subjects could have more than 1 diagnosis. Kappa, prevalence indexes, bias indexes, and maximal attainable kappa were calculated.
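As an illustration of the four statistics named above, the following is a minimal sketch of how they are commonly computed from a 2x2 table of paired ratings. It assumes the standard definitions for two raters and a positive/negative test; the abstract does not state the exact formulas used. The function name `agreement_indexes` and the cell counts are hypothetical and are not the study's data.

```python
def agreement_indexes(a, b, c, d):
    """Agreement statistics for two raters scoring a test as positive/negative.

    Cell counts (hypothetical layout, assumed for this sketch):
      a = both raters positive      b = rater 1 positive only
      c = rater 2 positive only     d = both raters negative
    """
    n = a + b + c + d
    po = (a + d) / n                      # observed agreement
    r1_pos = (a + b) / n                  # proportion rated positive by rater 1
    r2_pos = (a + c) / n                  # proportion rated positive by rater 2
    # Agreement expected by chance, from the raters' marginal proportions.
    # Note: kappa is undefined when pe == 1 (both raters give one rating only).
    pe = r1_pos * r2_pos + (1 - r1_pos) * (1 - r2_pos)
    kappa = (po - pe) / (1 - pe)
    # Prevalence index: imbalance between agreed positives and agreed negatives.
    prevalence_index = abs(a - d) / n
    # Bias index: imbalance between the two kinds of disagreement.
    bias_index = abs(b - c) / n
    # Maximal attainable kappa: best agreement the fixed marginals allow.
    po_max = min(r1_pos, r2_pos) + min(1 - r1_pos, 1 - r2_pos)
    kappa_max = (po_max - pe) / (1 - pe)
    return {
        "kappa": kappa,
        "prevalence_index": prevalence_index,
        "bias_index": bias_index,
        "kappa_max": kappa_max,
    }


# Hypothetical counts for 70 subjects, chosen only to exercise the function.
result = agreement_indexes(a=30, b=5, c=7, d=28)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A prevalence index near 0 means positive and negative findings were about equally common; a bias index near 0 means neither rater called the test positive systematically more often than the other.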

Results: Kappa (κ) coefficients with 95% confidence intervals (CI) were as follows: FABER test κ was 0.63 (95% CI: 0.43-0.83); flexion-internal rotation-adduction impingement test κ was 0.58 (95% CI: 0.29-0.87); log roll test κ was 0.61 (95% CI: 0.41-0.81); and greater trochanteric tenderness κ was 0.66 (95% CI: 0.48-0.84). Bias indexes were low (0.06-0.08) for all 4 tests while prevalence indexes were low (0.03-0.37) for 3 of the 4 tests. The flexion-internal rotation-adduction impingement test had a high prevalence index (0.76), with a higher proportion of positive tests.

Conclusion: For the FABER test, log roll test, and assessment of greater trochanteric tenderness, the lower bounds of the 95% confidence intervals for kappa were greater than 0.40 (fair level of agreement). The low reliability obtained for the flexion-internal rotation-adduction impingement test may be related to a prevalence concern.
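The prevalence concern can be illustrated with a short sketch. For a 2x2 table, chance agreement can be written in terms of the prevalence index (PI) and bias index (BI) as (1 + PI^2 - BI^2) / 2, so kappa falls as PI rises even when observed agreement is unchanged. The observed agreement of 0.90 and the bias index of 0 below are hypothetical values chosen for illustration; the abstract does not report observed agreement. The function name `kappa_from_indexes` is likewise an assumption of this sketch.

```python
def kappa_from_indexes(po, prevalence_index, bias_index):
    """Kappa for a 2x2 table from observed agreement and the two indexes.

    Uses the identity pe = (1 + PI^2 - BI^2) / 2 for chance agreement,
    which holds for two raters and a dichotomous rating.
    """
    pe = (1 + prevalence_index ** 2 - bias_index ** 2) / 2
    return (po - pe) / (1 - pe)


# Hypothetical observed agreement, held fixed while prevalence varies.
observed_agreement = 0.90
# 0.37 and 0.76 are prevalence indexes quoted in the Results; 0.0 is a baseline.
for pi in (0.0, 0.37, 0.76):
    k = kappa_from_indexes(observed_agreement, pi, bias_index=0.0)
    print(f"prevalence index {pi:.2f} -> kappa {k:.2f}")
```

Under these assumed inputs, kappa drops from about 0.80 to about 0.77 and then to about 0.53 as the prevalence index rises, although the raters agree on the same proportion of subjects in every case.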

Publication types

  • Validation Study

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Arthralgia / diagnosis*
  • Arthralgia / pathology
  • Diagnostic Tests, Routine
  • Female
  • Femur / pathology*
  • Health Status Indicators
  • Hip Joint / pathology*
  • Humans
  • Male
  • Middle Aged
  • Musculoskeletal Diseases / diagnosis*
  • Musculoskeletal Diseases / pathology
  • Prospective Studies
  • Reproducibility of Results
  • Risk Factors