Background: The reliability and validity of Axis II diagnoses were investigated in a sample of 108 patients with nonpsychotic Axis I disorders.
Methods: Patients were assessed for personality disorders (PDs) with either the Personality Disorder Examination (PDE) or the Structured Interview for DSM-III-R Personality (SIDP-R). Validity was examined by comparing interview diagnoses with "best-estimate" consensus diagnoses assigned by a panel of judges.
Results: Interrater reliabilities were excellent when using continuous data (eg, total or cluster scores; intraclass correlation coefficients, 0.82 to 0.92); they were lower with categorical diagnoses (eg, any PD vs no PD; kappa = 0.55 and 0.58 with the two interviews). Validity coefficients (ie, kappa values reflecting agreement between the interviews and the consensus diagnosis) for the decision of any PD vs no PD were 0.18 (56% agreement) with the PDE and 0.37 (75% agreement) with the SIDP-R; validity coefficients for identifying cases of "marked" PD were 0.21 (62% agreement) with the PDE and 0.24 (60% agreement) with the SIDP-R.
Conclusions: There have been important advances in the development of structured interviews for Axis II diagnoses, but the findings suggest that the strengths and weaknesses of these interviews should be weighed carefully before they are treated as definitive diagnostic tests. The findings also demonstrate some of the advantages of continuous over categorical data.
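
Note on kappa vs percent agreement: The Results report both a kappa value and a raw percent agreement for each comparison, and the two can diverge sharply because kappa discounts the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for a 2x2 table (interview diagnosis vs consensus diagnosis). The function name and the cell counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (percent agreement, Cohen's kappa) for a 2x2 table.

    Cell layout (hypothetical labels):
      a = interview PD,    consensus PD
      b = interview PD,    consensus no PD
      c = interview no PD, consensus PD
      d = interview no PD, consensus no PD
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    # Observed proportion of agreement (the diagonal cells).
    p_obs = (a + d) / n
    # Agreement expected by chance, from the row and column marginals.
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_obs - p_exp) / (1.0 - p_exp)
    return p_obs, kappa


# Two hypothetical tables with the same 75% raw agreement.
balanced = cohen_kappa(40, 5, 20, 35)   # diagnoses split fairly evenly
skewed = cohen_kappa(70, 15, 10, 5)     # most patients rated as PD

print(f"balanced: agreement={balanced[0]:.2f}, kappa={balanced[1]:.2f}")
print(f"skewed:   agreement={skewed[0]:.2f}, kappa={skewed[1]:.2f}")
```

In these made-up tables the raw agreement is identical, yet kappa is much lower in the skewed table because most of that agreement would be expected by chance when one category dominates. This is one reason a percent agreement should be read alongside its kappa rather than on its own.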