The accuracy of Patient Health Questionnaire-9 in detecting depression and measuring depression severity in high-risk groups in primary care

Gen Hosp Psychiatry. 2009 Sep-Oct;31(5):451-9. doi: 10.1016/j.genhosppsych.2009.06.001. Epub 2009 Jul 10.


Objective: Only half of patients with depressive disorder are diagnosed by their family physicians. Screening in high-risk groups might reduce this hidden morbidity. This study aims to determine the accuracy of the Patient Health Questionnaire-9 (PHQ-9) in (a) screening for depressive disorder, (b) diagnosing depressive disorder and (c) measuring the severity of depressive disorder in groups that are at high risk for depressive disorder.

Method: We compared the performance of the PHQ-9 as a screening instrument and as a diagnostic instrument to that of the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I), which we used as the reference standard. Three high-risk groups of patients were selected: (a) frequent attenders, (b) patients with mental health problems and (c) patients with unexplained complaints. Patients completed the PHQ-9. Next, patients who were at risk for depression (based on PHQ-9 scores) and a random sample of 20% of patients who were not at risk were selected for a second PHQ-9 and the reference standard (SCID-I). We assessed the adequacy of the PHQ-9 as a tool for severity measurement by comparing PHQ-9 scores with scores on the 17-item Hamilton Depression Rating Scale (HDRS-17) in patients diagnosed with a depressive disorder.

Results: Both PHQ-9 and SCID-I data were analyzed for 440 patients. The test characteristics for screening were sensitivity=0.93 and specificity=0.85; those for diagnosing were sensitivity=0.68 and specificity=0.95. The positive likelihood ratio for diagnosing was 14.2. The HDRS-17 was administered to 49 patients with depressive disorder. The Pearson correlation coefficient between PHQ-9 and HDRS-17 scores was r=.52 (P<.01).
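
As a rough illustration of how the reported test characteristics relate to one another, likelihood ratios can be recomputed from sensitivity and specificity using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The Python sketch below is not the authors' analysis: the function name is hypothetical, and the inputs are the rounded values quoted in this abstract. With those inputs the diagnostic LR+ comes out near 13.6 rather than the reported 14.2, a gap that is presumably due to rounding of the published sensitivity and specificity. The last lines square the reported correlation to show the proportion of variance the two severity scores share.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (hypothetical helper)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded values as quoted in the abstract (not the unrounded study data).
screening = likelihood_ratios(sensitivity=0.93, specificity=0.85)
diagnosing = likelihood_ratios(sensitivity=0.68, specificity=0.95)

print("screening  LR+ / LR-:", round(screening[0], 2), round(screening[1], 2))
print("diagnosing LR+ / LR-:", round(diagnosing[0], 2), round(diagnosing[1], 2))

# Proportion of variance shared by PHQ-9 and HDRS-17 scores, from r = .52.
r = 0.52
shared_variance = r ** 2
print("shared variance (r squared):", round(shared_variance, 2))
```

Read this way, a correlation of r=.52 corresponds to roughly a quarter of the variance being shared between the two scales, which is consistent with the conclusion that the PHQ-9 does not seem adequate for measuring severity.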

Conclusion: The PHQ-9 performs well as a screening instrument, but in diagnosing depressive disorder, a formal diagnostic process following the PHQ-9 remains imperative. The PHQ-9 does not seem adequate for measuring severity.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Depression / diagnosis*
  • Depression / epidemiology
  • Female
  • Humans
  • Male
  • Mass Screening
  • Middle Aged
  • Netherlands / epidemiology
  • Primary Health Care*
  • Sensitivity and Specificity
  • Severity of Illness Index*
  • Surveys and Questionnaires / standards*