Systematic evaluation and comparison of statistical tests for publication bias

J Epidemiol. 2005 Nov;15(6):235-43. doi: 10.2188/jea.15.235.


Background: This study evaluates the statistical and discriminatory powers of three statistical tests (Begg's, Egger's, and Macaskill's) for detecting publication bias in meta-analyses.

Methods: The data sources were 130 reviews from the Cochrane Database of Systematic Reviews 2002 issue, which considered a binary endpoint and contained 10 or more individual studies. Funnel plots with observers' agreement were used as the reference standard. We evaluated the trade-off between sensitivity and specificity by varying the cut-off p-value, the power of the statistical tests at fixed false positive rates, and the area under the receiver operating characteristic (ROC) curve.
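
The evaluation described in the Methods can be illustrated with a minimal sketch. It assumes each review contributes one test p-value and one reference-standard label (funnel plot judged asymmetric or not), and that a review is called positive when its p-value falls below the cut-off. The function names and the data layout are hypothetical; the abstract does not state how the authors computed these quantities.

```python
def sensitivity_specificity(p_values, biased, cutoff):
    """Sensitivity and specificity of the rule 'p < cutoff' against the reference labels.

    p_values: one test p-value per review (hypothetical input layout)
    biased:   reference-standard label per review, True if the funnel plot
              was judged to show publication bias
    """
    tp = sum(1 for p, b in zip(p_values, biased) if b and p < cutoff)
    fn = sum(1 for p, b in zip(p_values, biased) if b and p >= cutoff)
    tn = sum(1 for p, b in zip(p_values, biased) if not b and p >= cutoff)
    fp = sum(1 for p, b in zip(p_values, biased) if not b and p < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(p_values, biased):
    """Area under the ROC curve, computed as a Mann-Whitney proportion.

    A smaller p-value counts as stronger evidence of bias, so the AUC is the
    share of (biased, unbiased) review pairs in which the biased review has
    the smaller p-value; ties count as one half.
    """
    pos = [p for p, b in zip(p_values, biased) if b]
    neg = [p for p, b in zip(p_values, biased) if not b]
    wins = sum(
        1.0 if pp < pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))
```

Calling `sensitivity_specificity` over a range of cut-offs traces the sensitivity and specificity trade-off, and `roc_auc` summarizes it in a single number. Both functions assume at least one review in each reference class.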

Results: In 36 reviews, 733 original studies evaluated 2,874,006 subjects. The number of trials included in each review ranged from 10 to 70 (median 14.5). At a fixed false positive rate of 0.1, the sensitivity of Egger's method was 0.93, higher than that of Begg's method (0.86) and Macaskill's method (0.43). The sensitivities of the three statistical tests increased as the cut-off p-value increased, without a substantial decrease in specificity. The area under the ROC curve of Egger's method was 0.955 (95% confidence interval, 0.889-1.000); it did not differ significantly from that of Begg's method (area=0.913, p=0.2302), but it was larger than that of Macaskill's method (area=0.719, p=0.0116).

Conclusion: Egger's linear regression method and Begg's method had stronger statistical and discriminatory powers than Macaskill's method for detecting publication bias at the same type I error level. The power of these methods could be improved by increasing the cut-off p-value without a substantial increase in the false positive rate.
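
For orientation, Egger's linear regression method is commonly described as an ordinary least-squares fit of each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error), with an intercept far from zero suggesting funnel plot asymmetry. The sketch below follows that common description and is not taken from the abstract; the function name and inputs are hypothetical, and the effects are assumed to be on a scale such as the log odds ratio.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, standard error of intercept, t statistic).

    effects:    per-study effect estimates, e.g. log odds ratios (assumed)
    std_errors: per-study standard errors of those estimates
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least 3 studies to estimate the intercept")

    # Standardized effect (response) and precision (predictor).
    y = [e / s for e, s in zip(effects, std_errors)]
    x = [1.0 / s for s in std_errors]

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))

    # Guard against a perfect fit, where the t statistic is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat
```

The t statistic would be compared with a t distribution on n - 2 degrees of freedom to obtain the p-value that is then thresholded at the chosen cut-off. That last step is omitted here to keep the sketch free of external libraries.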

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Linear Models
  • Meta-Analysis as Topic
  • Publication Bias*
  • ROC Curve