Clinical practice guidelines state that the tissue source of low back pain cannot be specified in the majority of patients. However, there has been no systematic review of the accuracy of diagnostic tests used to identify the source of low back pain. The aim of this systematic review was therefore to determine the diagnostic accuracy of tests available to clinicians to identify the disc, facet joint or sacroiliac joint (SIJ) as the source of low back pain. MEDLINE, EMBASE and CINAHL were searched up to February 2006, and citation tracking was performed for eligible studies. Eligible studies compared index tests with an appropriate reference test (discography, facet joint or SIJ blocks, or medial branch blocks) in patients with low back pain. Positive likelihood ratios (+LR) > 2 or negative likelihood ratios (-LR) < 0.5 were considered informative. Forty-one studies of moderate quality were included; 28 investigated the disc, 8 the facet joint and 7 the SIJ. Various features observed on MRI (high intensity zone, endplate changes and disc degeneration) produced informative +LR (> 2) in the majority of studies, increasing the probability that the disc was the source of low back pain. However, heterogeneity of the data prevented pooling. The +LR ranged from 1.5 to 5.9 for high intensity zone, 1.6 to 4.0 for disc degeneration and 0.6 to 5.9 for endplate changes. Centralisation was the only clinical feature found to increase the likelihood of the disc as the source of pain: +LR = 2.8 (95%CI 1.4-5.3). Absence of degeneration on MRI was the only test found to reduce the likelihood of the disc as the source of pain: -LR = 0.21 (95%CI 0.12-0.35). Although single manual tests of the SIJ were uninformative, their use in combination was informative, with a +LR of 3.2 (95%CI 2.3-4.4) and a -LR of 0.29 (95%CI 0.12-0.35). None of the tests for facet joint pain were found to be informative.
The results of this review demonstrate that tests do exist that change the probability of the disc or SIJ (but not the facet joint) as the source of low back pain. However, the changes in probability are usually small and at best moderate. The usefulness of these tests in clinical practice, particularly for guiding treatment selection, remains unclear.
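As a rough illustration of how a likelihood ratio changes the probability of a pain source, the sketch below applies the standard odds form of Bayes' theorem (post-test odds = pre-test odds × LR) to the point estimates reported above. The pre-test probability of 0.40 is a hypothetical value chosen only for illustration; it is not reported in this review, and the function and variable names are likewise assumptions.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical pre-test probability (an assumption, not a figure from the review).
PRE_TEST = 0.40

# Point estimates of the likelihood ratios as reported in the abstract.
reported_lrs = {
    "Centralisation present (+LR 2.8)": 2.8,
    "No disc degeneration on MRI (-LR 0.21)": 0.21,
    "Combined SIJ tests positive (+LR 3.2)": 3.2,
    "Combined SIJ tests negative (-LR 0.29)": 0.29,
}

for finding, lr in reported_lrs.items():
    post = post_test_probability(PRE_TEST, lr)
    print(f"{finding}: {PRE_TEST:.0%} -> {post:.0%}")
```

The size of the shift depends heavily on the pre-test probability, so the hypothetical 0.40 should be replaced with an estimate appropriate to the clinical setting. The confidence limits can be passed through the same function to see how wide the plausible range of post-test probabilities is.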