Study design: Reliability study of guidelines development.
Objective: To compare criteria for low back surgery between two expert panels.
Background: The reliability of expert panels in determining the appropriateness of indications for surgical procedures has received little attention.
Methods: Two multidisciplinary expert panels of similar composition were convened, one in the United States and one in Switzerland, to evaluate the appropriateness of surgery in 720 distinct clinical scenarios involving sciatica. Each indication was assigned to one of three categories: appropriate, uncertain, or inappropriate. The appropriateness ratings for the 720 theoretical scenarios were compared between the two panels, and both sets of criteria were applied to two series of actual cases.
Results: Seventy-nine percent (n = 566) of the 720 theoretical indications were assigned to identical categories of appropriateness by both panels (kappa = 0.63; P < 0.001). Only 2 of the 720 scenarios elicited frank disagreement. The percentage of the 720 indications that were considered appropriate differed between the two panels (U.S.: 3%; Swiss: 11%; P < 0.001), as did the percentage of intrapanel agreement for indications (U.S.: 51%; Swiss: 64%; P < 0.001). When the same theoretical scenarios were matched with two series of actual cases (n = 181 and n = 149), agreement between the panels was moderate (kappa = 0.46) to fair (kappa = 0.30).
Conclusion: There was substantial agreement on the appropriateness of surgery for theoretical cases of sciatica between independent expert panels from two countries. A better understanding of discordant ratings, especially for actual cases, should precede attempts at transposing recommendations emanating from a panel in one country to another.
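Note on the agreement statistic: the kappa values reported above correct the raw percentage of agreement for the agreement expected by chance. The following is a minimal sketch of that calculation, assuming unweighted Cohen's kappa (the abstract does not state which variant was used). The function name cohen_kappa and the 3 x 3 table are hypothetical and purely illustrative; the counts are not data from this study.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[int]]) -> float:
    """Unweighted Cohen's kappa from a square contingency table.

    table[i][j] is the number of scenarios that panel A placed in
    category i and panel B placed in category j.
    """
    k = len(table)
    total = sum(sum(row) for row in table)

    # Observed agreement: share of scenarios on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total

    # Chance agreement: product of the two panels' marginal proportions,
    # summed over categories.
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))

    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT study data). Rows: panel A, columns: panel B.
# Category order: appropriate, uncertain, inappropriate.
toy_table = [
    [10, 5, 0],
    [5, 20, 10],
    [0, 10, 40],
]

kappa = cohen_kappa(toy_table)
print(f"kappa = {kappa:.2f}")
```

In this toy table the raw agreement is 70%, but a sizeable part of it would be expected by chance given how often each panel uses each category, so kappa is noticeably lower than the raw figure. The same gap explains why 79% identical ratings in the study corresponds to a kappa of 0.63.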