Objective: To evaluate the inter-rater agreement of the record review process of the Dutch Adverse Event study, a process we aimed to improve by having each record reviewed by two independent physicians instead of one, with a consensus procedure in case of disagreement.
Methods: To evaluate the record review process with two physicians and a consensus procedure, we measured the inter-rater agreement within pairs of physicians (independent reviews by physicians A and B; 4,272 records) and between pairs of physicians (independent reviews by pair A+B and pair C+D; 119 records).
Results: The inter-rater agreement within pairs of physicians was substantial for the determination of adverse events (AEs) with a kappa of 0.64 (95% confidence interval [CI]: 0.61, 0.68). The inter-rater agreement between pairs of physicians was fair for the determination of AEs with a kappa of 0.25 (95% CI: 0.05, 0.45).
Conclusion: A record review process with two physicians per record, including a consensus procedure, is not more reliable for assessing AEs than a record review process with one physician. Retrospective estimates of the incidence of AEs from record review studies should be interpreted with caution. The method needs to be improved before it can be used to monitor the incidence of AEs over time at a national level.
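
Note on the statistic: the kappa values reported in the Results correct observed agreement for agreement expected by chance. The following is a minimal sketch of that calculation for two raters, assuming Cohen's kappa with a simple large-sample standard error for the confidence interval. The abstract does not state which variance formula the study used, and the counts in the example table are hypothetical, not data from the study.

```python
import math

def cohen_kappa(table, z=1.96):
    """Cohen's kappa with an approximate confidence interval.

    table[i][j] = number of records placed in category i by the first
    reviewer (or pair) and in category j by the second.
    The standard error is a simple large-sample approximation (an
    assumption here; not necessarily the formula used in the study).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of records on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each reviewer.
    row_m = [sum(table[i]) / n for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginals.
    p_e = sum(r * c for r, c in zip(row_m, col_m))
    kappa = (p_o - p_e) / (1 - p_e)
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, (kappa - z * se, kappa + z * se)

# Hypothetical 2x2 table (AE present / AE absent), for illustration only.
example_table = [[40, 10],
                 [10, 40]]
kappa, ci = cohen_kappa(example_table)
print(f"kappa = {kappa:.2f}, 95% CI: {ci[0]:.2f}, {ci[1]:.2f}")
```

Because the standard error shrinks with the number of records, a comparison based on 119 records yields a much wider interval than one based on 4,272 records, which is consistent with the interval widths reported in the Results.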