Comparison of the unstructured clinician gestalt, the Wells score, and the revised Geneva score to estimate pretest probability for suspected pulmonary embolism

Ann Emerg Med. 2013 Aug;62(2):117-124.e2. doi: 10.1016/j.annemergmed.2012.11.002. Epub 2013 Feb 21.


Study objective: The assessment of clinical probability (as low, moderate, or high) with clinical decision rules has become a cornerstone of diagnostic strategy for patients with suspected pulmonary embolism, but little is known about the use of physician gestalt assessment of clinical probability. We evaluate the performance of gestalt assessment for diagnosing pulmonary embolism.

Methods: We conducted a retrospective analysis of a prospective observational cohort of consecutive suspected pulmonary embolism patients in emergency departments. Accuracy of gestalt assessment was compared with the Wells score and the revised Geneva score by the area under the curve (AUC) of receiver operating characteristic curves. Agreement between the 3 methods was determined by κ test.
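The two statistics named in the Methods can be illustrated with a minimal sketch. It is not the authors' analysis code: the toy data are invented for illustration, the 0/1/2 coding of low/moderate/high probability is an assumption, and unweighted Cohen's κ for one pair of methods is assumed because the abstract does not say which κ variant was used. The helper names `auc` and `cohen_kappa` are hypothetical.

```python
from collections import Counter


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: agreement between two raters beyond chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented toy data (NOT from the study): 1 = pulmonary embolism confirmed.
pe = [0, 0, 0, 1, 0, 1, 1, 1]
# Assumed coding of clinical probability: 0 = low, 1 = moderate, 2 = high.
gestalt = [0, 0, 1, 1, 0, 2, 2, 1]
rule = [0, 1, 1, 1, 1, 1, 2, 0]

print("AUC gestalt:", auc(gestalt, pe))
print("AUC rule:   ", auc(rule, pe))
print("kappa:      ", cohen_kappa(gestalt, rule))
```

A higher AUC means the probability categories separate patients with and without pulmonary embolism more cleanly. A κ near 0 means the two methods agree little more often than chance would predict.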

Results: The study population was 1,038 patients, with a pulmonary embolism prevalence of 31.3%. AUC differed significantly between the 3 methods and was 0.81 (95% confidence interval [CI] 0.78 to 0.84) for gestalt assessment, 0.71 (95% CI 0.68 to 0.75) for the Wells score, and 0.66 (95% CI 0.63 to 0.70) for the revised Geneva score. The proportion of patients categorized as having low clinical probability was significantly higher with gestalt than with the revised Geneva score (43% versus 26%; difference 17%, 95% CI 13% to 21%). The proportion of patients categorized as having high clinical probability was higher with gestalt than with the Wells score (24% versus 7%; difference 17%, 95% CI 14% to 20%) or the revised Geneva score (24% versus 10%; difference 15%, 95% CI 13% to 21%). Pulmonary embolism prevalence was significantly lower with gestalt than with clinical decision rules in the low clinical probability group (7.6% for gestalt versus 13.0% for the revised Geneva score and 12.6% for the Wells score) and in the non-high clinical probability group (18.3% for gestalt versus 29.3% for the Wells score and 27.4% for the revised Geneva score), and was significantly higher with gestalt than with the Wells score in the high clinical probability group (72.1% versus 58.1%). Agreement between the 3 methods was poor, with all κ values below 0.3.

Conclusion: In our retrospective study, gestalt assessment seems to perform better than clinical decision rules because of better selection of patients with low and high clinical probability.

Publication types

  • Comparative Study
  • Evaluation Study
  • Multicenter Study
  • Research Support, Non-U.S. Gov't
  • Comment

MeSH terms

  • Pulmonary Embolism / diagnosis*
  • Decision Support Techniques*
  • Female
  • Humans
  • Male