Validating the accuracy of administrative healthcare data identifying epilepsy in deceased adults: A Scottish data linkage study

Epilepsy Res. 2020 Nov:167:106462. doi: 10.1016/j.eplepsyres.2020.106462. Epub 2020 Sep 13.


Background: We investigated how accurately four administrative healthcare datasets ascertain cases of potentially active epilepsy among deceased adults in Scotland.

Methods: In this diagnostic accuracy study, unique patient identifiers were used to link administrative healthcare data for adults (aged 16 years and over) who died in Scotland between 01/01/09 and 01/01/16. Cases were ascertained by linking mortality records, hospital admissions, antiepileptic drug (AED) prescriptions, and primary care attendances. We assessed ICD-10 codes G40 (epilepsy), G41 (status epilepticus), and R56.8 (seizures) listed as causes of death and as hospital admission reasons, along with various AEDs and F25 primary care epilepsy Read codes. These epilepsy indicators were searched for across the period 01/01/09 to 01/01/16, so that a positive indicator suggested active epilepsy within a maximum of seven years before death. The indicators were compared with epilepsy diagnoses made from medical records reviewed by a senior epileptologist; a second senior epileptologist independently reviewed the medical records of a 10 % sample to check specialist interrater agreement in epilepsy diagnoses. We validated how accurately epilepsy was identified by each dataset alone and in combination, calculating positive predictive value (PPV) and sensitivity with 95 % confidence intervals (CIs).
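
The accuracy measures named above can be illustrated with a minimal sketch. It assumes a Wilson score interval for the 95 % CIs, since the abstract does not state which interval method was used, and the counts in the usage example are hypothetical rather than taken from the study. The function names `wilson_ci` and `ppv_and_sensitivity` are illustrative.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95 % for z = 1.96).

    Assumption: the study's CI method is not stated in the abstract;
    Wilson is used here only as one common choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def ppv_and_sensitivity(tp, fp, fn):
    """Return PPV and sensitivity, each paired with its interval.

    tp: indicator positive, epilepsy confirmed on record review
    fp: indicator positive, epilepsy not confirmed
    fn: indicator negative, epilepsy confirmed
    """
    ppv = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "ppv": (ppv, wilson_ci(tp, tp + fp)),
        "sensitivity": (sensitivity, wilson_ci(tp, tp + fn)),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, (value, (low, high)) in ppv_and_sensitivity(tp=80, fp=20, fn=20).items():
        print(f"{name}: {value:.2f} (CI {low:.2f}-{high:.2f})")
```

PPV is computed only over indicator-positive records and sensitivity only over reference-positive records, so neither measure requires the count of true negatives.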

Results: 159,032 deceased potential epilepsy cases were captured across the four datasets. Medical records reviewed in a random sample of 936 confirmed that epilepsy was present in 614 and absent in 322. Specialist interrater diagnostic agreement was substantial (100 medical records reviewed in duplicate, kappa = 0.72, CI 0.58-0.86). G40-41 cause of death codes had a PPV of 86 % (CI 84-89 %) and sensitivity of 73 % (CI 69-76 %). Adding R56.8 lowered PPV to 69 % (CI 65-72 %) and raised sensitivity to 87 % (CI 84-90 %). The optimal algorithm combining two datasets consisted of F25 Read codes paired with AEDs (PPV 86 % (CI 80-92 %), sensitivity 93 % (CI 88-97 %)). Also effective was pairing G40-41 and/or R56.8 cause of death codes with AEDs (PPV 91 % (CI 89-94 %), sensitivity 81 % (CI 77-84 %)). Whilst algorithms combining three datasets raised PPV to as high as 93-95 %, the associated sensitivities were low (71 % at most).
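
The interrater agreement statistic reported above is Cohen's kappa, which corrects the observed agreement between two reviewers for the agreement expected by chance. The following is a minimal sketch for a binary diagnosis (epilepsy present or absent). The function name `cohens_kappa` and the cell counts in the usage example are hypothetical and are not the study's data.

```python
def cohens_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters making a yes/no diagnosis.

    The four arguments are the cells of the 2x2 agreement table
    (rater A in rows, rater B in columns).
    """
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    if n == 0:
        raise ValueError("agreement table is empty")
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n   # proportion rater A called "yes"
    b_yes = (both_yes + a_no_b_yes) / n   # proportion rater B called "yes"
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical table of 100 records reviewed in duplicate.
    print(round(cohens_kappa(both_yes=40, a_yes_b_no=10, a_no_b_yes=10, both_no=40), 2))
```

Under the commonly used Landis and Koch benchmarks, kappa values between 0.61 and 0.80 are described as substantial agreement, which matches the wording used for the reported value of 0.72.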

Conclusions: Routinely collected Scottish data can accurately identify epilepsy in deceased adults. It may be necessary to combine diagnostic codes with AED prescription records to ensure optimal case-ascertainment. The results help inform the design of future Scottish epilepsy mortality studies recruiting from administrative data sources.
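
How a combined case-ascertainment rule of this kind might be applied to linked records can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the `Record` layout, the dotted code formatting, the rule functions, and the five toy records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical linked record for one deceased adult."""
    death_codes: tuple        # ICD-10 cause of death codes, e.g. ("G40.9",)
    read_codes: tuple         # primary care Read codes, e.g. ("F25..",)
    has_aed: bool             # any AED prescription in the search window
    reference_epilepsy: bool  # diagnosis from medical record review


def has_prefix(codes, prefixes):
    """True if any code starts with one of the given prefixes."""
    return any(code.startswith(prefixes) for code in codes)


def read_plus_aed(record):
    """F25 Read code paired with an AED prescription."""
    return has_prefix(record.read_codes, ("F25",)) and record.has_aed


def death_plus_aed(record):
    """G40-41 and/or R56.8 cause of death code paired with an AED prescription."""
    return has_prefix(record.death_codes, ("G40", "G41", "R56.8")) and record.has_aed


def evaluate(records, rule):
    """Count rule outcomes against the reference diagnosis."""
    tp = sum(1 for r in records if rule(r) and r.reference_epilepsy)
    fp = sum(1 for r in records if rule(r) and not r.reference_epilepsy)
    fn = sum(1 for r in records if not rule(r) and r.reference_epilepsy)
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "ppv": tp / (tp + fp) if tp + fp else None,
        "sensitivity": tp / (tp + fn) if tp + fn else None,
    }


# Toy records, invented for illustration only.
RECORDS = [
    Record(("G40.9",), ("F25..",), True, True),
    Record(("R56.8",), (), False, False),
    Record(("I21.9",), ("F25..",), True, True),
    Record(("I64",), (), True, False),
    Record(("R56.8",), (), True, False),
]

if __name__ == "__main__":
    for rule in (read_plus_aed, death_plus_aed):
        print(rule.__name__, evaluate(RECORDS, rule))
```

Requiring an AED prescription alongside a diagnostic code narrows the positive set, which tends to raise PPV at some cost to sensitivity. The same trade-off appears in the reported results when three datasets are combined.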

Keywords: Cause of death; Diagnostic accuracy study; ICD-10; Mortality; Routine data; Seizures.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Anticonvulsants / therapeutic use*
  • Databases, Factual
  • Delivery of Health Care*
  • Epilepsy / drug therapy*
  • Female
  • Humans
  • Information Storage and Retrieval
  • Male
  • Middle Aged
  • Scotland
  • Status Epilepticus / drug therapy*


Substances

  • Anticonvulsants