A systematic review was conducted to determine the inter-examiner reliability of passive assessment of segmental intervertebral motion in the cervical and lumbar spine and to explore sources of heterogeneity. Passive assessment of motion is used to guide treatment decisions for patients with neck and low-back pain. Its inter-examiner reliability has been a matter of debate, raising questions about professional credibility and accountability. A structured search for relevant studies in MEDLINE and CINAHL was followed by extensive reference tracing and hand searching. Studies presenting estimates of reliability for individual motion segments were included. No language restrictions were imposed. Study quality was assessed using criteria derived from the Standards for Reporting of Diagnostic Accuracy (STARD) statement and from QUADAS, a quality assessment tool for studies of diagnostic accuracy included in systematic reviews. Study selection, quality assessment, and data extraction were performed independently by two reviewers. Qualitative analyses and additional subgroup analyses were conducted. Nineteen studies were included. Two studies satisfied the criteria for both external and internal validity; one of these found fair to moderate reliability. Assessment of motion segments C1-C2 and C2-C3 almost consistently reached at least fair reliability. Overall, inter-examiner reliability was poor to fair. However, most studies were of poor methodological quality. We propose explicit recommendations for the conduct and reporting of future research.