Do ultra-short screening instruments accurately detect depression in primary care? A pooled analysis and meta-analysis of 22 studies

Br J Gen Pract. 2007 Feb;57(535):144-51.

Abstract

Background: Guidance from the National Institute for Health and Clinical Excellence recommends one or two questions as a possible screening method for depression. Ultra-short (one-, two-, three- or four-item) tests have appeal due to their simple administration but their accuracy has not been established.

Aim: To determine whether ultra-short screening instruments accurately detect depression in primary care.

Design of study: Pooled analysis and meta-analysis.

Method: A literature search identified 75 possible studies; of these, 22 studies of ultra-short tests that complied with STARD (Standards for Reporting of Diagnostic Accuracy) were entered into the analysis.

Results: Meta-analysis revealed a performance accuracy better than chance (P<0.001). More usefully for clinicians, pooled analysis of single-question tests revealed an overall sensitivity of 32.0% and specificity of 97.0% (positive predictive value [PPV] was 55.6% and negative predictive value [NPV] was 92.3%). For two- and three-item tests, overall sensitivity on pooled analysis was 73.7% and specificity was 74.7%, with a PPV of only 38.3% but a pooled NPV of 93.0%. The Youden index for single-item and multiple-item tests was 0.289 and 0.47 respectively, suggesting superiority of multiple-item tests. Re-analysis examining only 'either or' strategies improved the 'rule out' ability of two- and three-question tests (sensitivity 79.4% and NPV 94.7%) but at the expense of being able to rule in a possible diagnosis if the result was positive.
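As an illustration of how the summary statistics above relate to one another, the following minimal Python sketch computes the Youden index (sensitivity + specificity - 1) from the pooled sensitivities and specificities reported in the Results. The function names and the `prevalence` argument are assumptions introduced for illustration only; the abstract does not state the pooled prevalence, so the predictive-value helper is shown without any specific result. The rounded percentages give approximately 0.290 and 0.484, close to but not identical with the reported 0.289 and 0.47, which may reflect rounding or the way the pooled figures were derived.

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1, with inputs as proportions."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' theorem.

    `prevalence` is a hypothetical input: the abstract does not report it.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled values taken from the Results, expressed as proportions.
single_j = youden_index(0.320, 0.970)  # single-question tests
multi_j = youden_index(0.737, 0.747)   # two- and three-item tests

print(f"Youden index, single-item:   {single_j:.3f}")
print(f"Youden index, multiple-item: {multi_j:.3f}")
```

A higher Youden index indicates better overall discrimination, which is the sense in which the multiple-item tests are described as superior. Calling `predictive_values` with a locally plausible prevalence shows how PPV and NPV shift with the base rate of depression in a given practice.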

Conclusion: A one-question test identifies only three out of every 10 patients with depression in primary care and is thus unacceptable if relied on alone. Ultra-short two- or three-question tests perform better, identifying eight out of 10 cases, but at the expense of a high false-positive rate (only four out of 10 patients with a positive score are actually depressed). Ultra-short tests appear to be, at best, a method for ruling out a diagnosis and should only be used when there are sufficient resources for second-stage assessment of those who screen positive.

Publication types

  • Meta-Analysis

MeSH terms

  • Depressive Disorder / diagnosis*
  • Family Practice
  • Humans
  • Psychiatric Status Rating Scales / standards*
  • Sensitivity and Specificity