A comparison of self-report and clinical diagnostic interviews for depression: Diagnostic Interview Schedule and Schedules for Clinical Assessment in Neuropsychiatry in the Baltimore Epidemiologic Catchment Area follow-up

W W Eaton; K Neufeld; L S Chen; G Cai

doi:10.1001/archpsyc.57.3.217

Arch Gen Psychiatry. 2000 Mar;57(3):217-22. doi: 10.1001/archpsyc.57.3.217.

Authors

W W Eaton¹, K Neufeld, L S Chen, G Cai

Affiliation

¹ Department of Mental Hygiene, School of Hygiene and Public Health, Johns Hopkins University, Baltimore, MD 21205-1999, USA.

PMID: 10711906

Abstract

Background: The field of psychiatric epidemiology continues to employ self-report instruments, but the low degree of agreement between diagnoses reached with these instruments and those reached by psychiatrists in the clinical modality threatens the credibility of the results.

Methods: In the Baltimore Epidemiologic Catchment Area follow-up, 349 individuals who had a Diagnostic Interview Schedule (DIS) interview were blindly examined by psychiatrists using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Comparisons were made at the level of diagnosis, syndrome, and DSM-IV symptom group. Indexes of agreement were computed and characteristics of discrepant cases were identified.

Results: Agreement on diagnosis of major depressive disorder was only fair (kappa = 0.20), with the DIS missing many cases judged to meet criteria for diagnosis using the SCAN (29% sensitivity). A major source of discrepancy was respondents with false-negative diagnoses who repeatedly failed to report DIS symptoms attributed to life crises or medical conditions. In logistic regression analysis, older age, male sex, and lower impairment were associated with underdetection by the DIS. In spite of the diagnostic discrepancy, there was substantial correlation in the numbers of symptom groups in the 2 modalities (r = 0.49). Agreement was highest (about 55% sensitivity and 90% specificity) when both the SCAN and DIS thresholds were set at the level of depression syndrome instead of diagnosis.
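
The agreement indexes reported above (kappa, sensitivity, specificity) can all be derived from a 2x2 cross-classification of the two interviews. The following is a minimal sketch of that arithmetic, assuming the clinical interview is treated as the reference standard. The function name `agreement_indexes` and the cell counts are hypothetical and chosen only for illustration; they are not the study's data and are not meant to reproduce its figures.

```python
def agreement_indexes(tp, fp, fn, tn):
    """Compute sensitivity, specificity, and Cohen's kappa from a 2x2 table.

    tp: positive on both the self-report and the reference interview
    fp: positive on self-report only
    fn: positive on the reference interview only
    tn: negative on both
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # share of reference-positive cases detected
    specificity = tn / (tn + fp)  # share of reference-negative cases cleared
    observed = (tp + tn) / n      # raw proportion of agreement
    # Agreement expected by chance, from the marginal totals of each interview
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (NOT taken from the study)
result = agreement_indexes(tp=20, fp=10, fn=30, tn=140)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because kappa corrects raw agreement for chance, it can be low even when specificity is high, which is the pattern to expect when one instrument misses many of the cases identified by the other.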

Conclusions: Weak agreement at the level of diagnosis continues to threaten the credibility of estimates of prevalence of specific disorders. A bias toward underreporting, as well as stronger agreement at the level of the depression syndrome and on ordinal measures of depressive symptoms, suggests that associations with risk factors are conservative.

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Adolescent
Adult
Baltimore / epidemiology
Catchment Area, Health
Depressive Disorder / diagnosis*
Depressive Disorder / epidemiology
False Negative Reactions
Female
Follow-Up Studies
Health Surveys*
Humans
Male
Middle Aged
Predictive Value of Tests
Prevalence
Psychiatric Status Rating Scales / statistics & numerical data*
Regression Analysis
Reproducibility of Results
Risk Factors

Grants and funding

MH47447/MH/NIMH NIH HHS/United States