Testing the risk of bias tool showed low reliability between individual reviewers and across consensus assessments of reviewer pairs

J Clin Epidemiol. 2013 Sep;66(9):973-81. doi: 10.1016/j.jclinepi.2012.07.005. Epub 2012 Sep 13.


Objectives: To assess the reliability of the Cochrane Risk of Bias (ROB) tool between individual raters and across consensus agreements of pairs of reviewers and examine the impact of study-level factors on reliability.

Study design and setting: Two reviewers assessed risk of bias for 154 randomized controlled trials (RCTs). For 30 RCTs, two reviewers from each of four centers assessed risk of bias and reached consensus. We assessed interrater agreement using kappas and the impact of study-level factors through subgroup analyses.
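As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The abstract does not state which kappa variant was used (a weighted kappa is common for ordered judgments), so the unweighted form, the function name `cohen_kappa`, and the example ratings are all assumptions for illustration only, not the study's data or method.

```python
from collections import Counter
import math


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters judging the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same judgment by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed per category.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments for six trials on one domain (not data from the study).
ratings_a = ["low", "low", "high", "unclear", "low", "high"]
ratings_b = ["low", "unclear", "high", "unclear", "high", "high"]
kappa = cohen_kappa(ratings_a, ratings_b)
print(round(kappa, 2))  # 0.52
```

In this made-up example the raters agree on 4 of 6 trials (0.67), but agreement expected by chance alone is 11/36 (0.31), so kappa is 13/25 = 0.52.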

Results: Reliability between two reviewers was fair for most domains (κ=0.24-0.37), except sequence generation (κ=0.79, substantial). Reliability across reviewer pairs was moderate for sequence generation (κ=0.60); fair for allocation concealment and "other sources of bias" (κ=0.37-0.27); and slight for the other domains (κ=0.05-0.09). Reliability was influenced by the nature of the outcome, nature of the intervention, study design, trial hypothesis, and funding source. Variability resulted from differing interpretations of the tool rather than from different information identified in the study reports.
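The verbal labels in the results (slight, fair, moderate, substantial) are consistent with the commonly used Landis and Koch benchmarks for kappa, although the abstract does not name the scale. A minimal sketch of that mapping, assuming those conventional cut-points and a hypothetical helper named `agreement_label`:

```python
def agreement_label(kappa):
    """Map a kappa value to a verbal label using Landis and Koch style cut-points.

    Assumption: the abstract does not name its scale; these are the
    conventional thresholds, used here only for illustration.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values reported in the results above.
for value in (0.79, 0.60, 0.37, 0.27, 0.09, 0.05):
    print(value, agreement_label(value))
```

Under these cut-points, each kappa reported above receives the same label the abstract gives it.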

Conclusion: Low agreement has implications for interpreting systematic reviews. These findings suggest the need for detailed guidance in assessing the risk of bias.

Keywords: Internal validity; Meta-Analysis; Randomized controlled trials; Reliability; Risk of bias; Systematic reviews.

MeSH terms

  • Bias*
  • Consensus
  • Humans
  • Observer Variation
  • Randomized Controlled Trials as Topic
  • Reproducibility of Results
  • Research Design
  • Review Literature as Topic*
  • Risk Assessment