Can mental health diagnoses in administrative data be used for research? A systematic review of the accuracy of routinely collected diagnoses

BMC Psychiatry. 2016 Jul 26;16:263. doi: 10.1186/s12888-016-0963-x.


Background: There is increasing availability of data derived from diagnoses made routinely in mental health care, and interest in using these for research. Such data will be subject to both diagnostic (clinical) error and administrative error, and so it is necessary to evaluate their accuracy against a reference standard. Our aim was to review studies where this had been done to guide the use of other available data.

Methods: We searched PubMed and EMBASE for studies comparing routinely collected mental health diagnosis data to a reference standard. We produced diagnostic category-specific positive predictive values (PPV) and Cohen's kappa for each study.
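As a minimal sketch of the two summary statistics named above, the following Python computes PPV and Cohen's kappa from a 2x2 table of routine diagnosis against reference standard. The function names and the cell counts are hypothetical and chosen only for illustration; they are not taken from the review or from any included study.

```python
def ppv(tp, fp):
    """Positive predictive value: share of routine diagnoses that the
    reference standard confirms."""
    return tp / (tp + fp)


def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 table (routine diagnosis vs reference).

    tp: both say the diagnosis is present
    fp: routine data say present, reference says absent
    fn: routine data say absent, reference says present
    tn: both say absent
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
example = dict(tp=40, fp=10, fn=20, tn=30)
example_ppv = ppv(example["tp"], example["fp"])
example_kappa = cohens_kappa(**example)
print(f"PPV = {example_ppv:.2f}, kappa = {example_kappa:.2f}")
```

With these made-up counts, observed agreement is 0.70 and chance agreement is 0.50, so kappa corrects the raw agreement downward while PPV looks only at the cases flagged in the routine data.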

Results: We found 39 eligible studies. Studies were heterogeneous in design, with a wide range of outcomes. Administrative error was small compared to diagnostic error. PPV was related to base rate of the respective condition, with overall median of 76%. Kappa results on average showed moderate agreement between source data and reference standard for most diagnostic categories (median kappa = 0.45-0.55); anxiety disorders and schizoaffective disorder showed poorer agreement. There was no significant benefit in accuracy for diagnoses made in inpatients.
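The link between PPV and base rate is a general consequence of Bayes' theorem rather than something specific to these studies. As an illustrative sketch, with symbols that are assumptions and not the review's notation, let Se be the sensitivity and Sp the specificity of the routine diagnosis, and p the base rate of the condition in the sample:

```latex
\mathrm{PPV} = \frac{\mathrm{Se}\, p}{\mathrm{Se}\, p + (1 - \mathrm{Sp})(1 - p)}
```

For hypothetical values Se = 0.8 and Sp = 0.9, this gives PPV of about 0.89 when p = 0.5 but only about 0.47 when p = 0.1, so the same diagnostic performance yields a lower PPV for rarer conditions.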

Conclusions: The current evidence partly answered our questions. There was wide variation in the quality of source data, with a risk of publication bias. For some diagnoses, especially psychotic categories, administrative data were generally predictive of true diagnosis. For others, such as anxiety disorders, the data were less satisfactory. We discuss the implications of our findings, and the need for researchers to validate routine diagnostic data.

Keywords: Administrative data; Case registers; Diagnosis; Electronic health records; Hospital episode statistics; Population research; Psychiatry.

Publication types

  • Review
  • Systematic Review

MeSH terms

  • Data Accuracy*
  • Humans
  • Inpatients
  • Mental Disorders / diagnosis*
  • Predictive Value of Tests
  • Research Design / statistics & numerical data*
  • Research*