Background: Helicobacter pylori (H pylori) infection has been implicated in a number of malignancies and non-malignant conditions including peptic ulcers, non-ulcer dyspepsia, recurrent peptic ulcer bleeding, unexplained iron deficiency anaemia, idiopathic thrombocytopenic purpura, and colorectal adenomas. The confirmatory diagnosis of H pylori is by endoscopic biopsy, followed by histopathological examination using haematoxylin and eosin (H & E) stain or special stains such as Giemsa stain and Warthin-Starry stain. Special stains are more accurate than H & E stain. There is significant uncertainty about the diagnostic accuracy of non-invasive tests for diagnosis of H pylori.
Objectives: To compare the diagnostic accuracy of urea breath test, serology, and stool antigen test, used alone or in combination, for diagnosis of H pylori infection in symptomatic and asymptomatic people, so that eradication therapy for H pylori can be started.
Search methods: We searched MEDLINE, Embase, the Science Citation Index and the National Institute for Health Research Health Technology Assessment Database on 4 March 2016. We screened references in the included studies to identify additional studies. We also conducted citation searches of relevant studies, most recently on 4 December 2016. We did not restrict studies by language or publication status, or whether data were collected prospectively or retrospectively.
Selection criteria: We included diagnostic accuracy studies that evaluated at least one of the index tests (urea breath test using isotopes such as 13C or 14C, serology and stool antigen test) against the reference standard (histopathological examination using H & E stain, special stains or immunohistochemical stain) in people suspected of having H pylori infection.
Data collection and analysis: Two review authors independently screened the references to identify relevant studies and independently extracted data. We assessed the methodological quality of studies using the QUADAS-2 tool. We performed meta-analysis by using the hierarchical summary receiver operating characteristic (HSROC) model to estimate and compare SROC curves. Where appropriate, we used bivariate or univariate logistic regression models to estimate summary sensitivities and specificities.
Main results: We included 101 studies involving 11,003 participants, of which 5839 participants (53.1%) had H pylori infection. The prevalence of H pylori infection in the studies ranged from 15.2% to 94.7%, with a median prevalence of 53.7% (interquartile range 42.0% to 66.5%). Most of the studies (57%) included participants with dyspepsia and 53 studies excluded participants who recently had proton pump inhibitors or antibiotics. There was at least an unclear risk of bias or unclear applicability concern for each study. Of the 101 studies, 15 compared the accuracy of two index tests and two studies compared the accuracy of three index tests. Thirty-four studies (4242 participants) evaluated serology; 29 studies (2988 participants) evaluated stool antigen test; 34 studies (3139 participants) evaluated urea breath test-13C; 21 studies (1810 participants) evaluated urea breath test-14C; and two studies (127 participants) evaluated urea breath test but did not report the isotope used. The thresholds used to define test positivity and the staining techniques used for histopathological examination (reference standard) varied between studies. Due to sparse data for each threshold reported, it was not possible to identify the best threshold for each test. Using data from 99 studies in an indirect test comparison, there was statistical evidence of a difference in diagnostic accuracy between urea breath test-13C, urea breath test-14C, serology and stool antigen test (P = 0.024). The diagnostic odds ratios for urea breath test-13C, urea breath test-14C, serology, and stool antigen test were 153 (95% confidence interval (CI) 73.7 to 316), 105 (95% CI 74.0 to 150), 47.4 (95% CI 25.5 to 88.1) and 45.1 (95% CI 24.2 to 84.1).
The sensitivity estimated at a fixed specificity of 0.90 (median from studies across the four tests) was 0.94 (95% CI 0.89 to 0.97) for urea breath test-13C, 0.92 (95% CI 0.89 to 0.94) for urea breath test-14C, 0.84 (95% CI 0.74 to 0.91) for serology, and 0.83 (95% CI 0.73 to 0.90) for stool antigen test. This implies that on average, given a specificity of 0.90 and prevalence of 53.7% (median specificity and prevalence in the studies), out of 1000 people tested for H pylori infection, there will be 46 false positives (people without H pylori infection who will be diagnosed as having H pylori infection). In this hypothetical cohort, urea breath test-13C, urea breath test-14C, serology, and stool antigen test will give 30 (95% CI 15 to 58), 42 (95% CI 30 to 58), 86 (95% CI 50 to 140), and 89 (95% CI 52 to 146) false negatives respectively (people with H pylori infection for whom the diagnosis of H pylori will be missed). Direct comparisons were based on few head-to-head studies. The ratios of diagnostic odds ratios (DORs) were 0.68 (95% CI 0.12 to 3.70; P = 0.56) for urea breath test-13C versus serology (seven studies), and 0.88 (95% CI 0.14 to 5.56; P = 0.84) for urea breath test-13C versus stool antigen test (seven studies). The 95% CIs of these estimates overlap with those of the ratios of DORs from the indirect comparison. Data were limited or unavailable for meta-analysis of other direct comparisons.
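The hypothetical cohort of 1000 people described above can be reproduced approximately with the short sketch below. It is an illustration, not the review's own analysis code: the function name `cohort_errors` and the dictionary of rounded sensitivities are assumptions introduced here. Because the inputs are the rounded sensitivities quoted in the text, some false-negative counts may differ by a few people from the published figures (for example, about 32 rather than 30 for urea breath test-13C), presumably because the review worked from unrounded estimates.

```python
def cohort_errors(sensitivity, specificity, prevalence, cohort_size=1000):
    """Return expected (false positives, false negatives) in a tested cohort."""
    infected = prevalence * cohort_size          # people with H pylori infection
    uninfected = cohort_size - infected          # people without infection
    false_positives = (1 - specificity) * uninfected
    false_negatives = (1 - sensitivity) * infected
    return false_positives, false_negatives


# Values quoted in the abstract (rounded to two decimal places).
SPECIFICITY = 0.90   # fixed specificity (median across the four tests)
PREVALENCE = 0.537   # median prevalence in the included studies
SENSITIVITIES = {
    "urea breath test-13C": 0.94,
    "urea breath test-14C": 0.92,
    "serology": 0.84,
    "stool antigen test": 0.83,
}

for test, sens in SENSITIVITIES.items():
    fp, fn = cohort_errors(sens, SPECIFICITY, PREVALENCE)
    print(f"{test}: about {fp:.0f} false positives, {fn:.0f} false negatives")
```

The false-positive count is the same for every test because specificity is held fixed at 0.90; only the false-negative count changes with each test's sensitivity.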
Authors' conclusions: In people without a history of gastrectomy and those who have not recently had antibiotics or proton pump inhibitors, urea breath tests had high diagnostic accuracy while serology and stool antigen tests were less accurate for diagnosis of Helicobacter pylori infection. This is based on an indirect test comparison (with potential for bias due to confounding), as evidence from direct comparisons was limited or unavailable. The thresholds used for these tests were highly variable and we were unable to identify specific thresholds that might be useful in clinical practice. We need further comparative studies of high methodological quality to obtain more reliable evidence of relative accuracy between the tests. Such studies should be conducted prospectively in a representative spectrum of participants and clearly reported to ensure low risk of bias. Most importantly, studies should prespecify and clearly report thresholds used, and should avoid inappropriate exclusions.