Background: Systematic reviews are used widely to guide health care decisions. Several tools have been created to assess systematic review quality. The measurement tool for assessing the methodological quality of systematic reviews known as the AMSTAR tool applies a yes/no score to eleven relevant domains of review methodology. This tool has been reworked so that each domain is scored based on a four point scale, producing R-AMSTAR.
Methods and findings: We aimed to compare the AMSTAR and R-AMSTAR tools in assessing systematic reviews in the field of assisted reproduction for subfertility. All published systematic reviews on assisted reproductive technology, with the latest search for studies taking place from 2007-2011, were considered. Reviews that contained no included studies or considered diagnostic outcomes were excluded. Thirty each of Cochrane and non-Cochrane reviews were randomly selected from a search of relevant databases. Both tools were then applied to all sixty reviews. The results were converted to percentage scores and all reviews graded and ranked based on this. AMSTAR produced a much wider variation in percentage scores and achieved higher inter-rater reliability than R-AMSTAR according to kappa statistics. The average rating for Cochrane reviews was consistent between the two tools (88.3% for R-AMSTAR versus 83.6% for AMSTAR) but inconsistent for non-Cochrane reviews (63.9% R-AMSTAR vs. 38.5% AMSTAR). In comparing the rankings generated between the two tools Cochrane reviews changed an average of 4.2 places, compared to 2.9 for non-Cochrane.
Conclusion: R-AMSTAR provided greater guidance in the assessment of domains and produced quantitative results. However, there were many problems with the construction of its criteria and AMSTAR was much easier to apply consistently. We recommend that AMSTAR incorporates the findings of this study and produces additional guidance for its application in order to improve its reliability and usefulness.