Background: Population-level health administrative datasets such as hospital discharge data are used increasingly to evaluate health services and outcomes of care. However, information about the accuracy of Australian discharge data in identifying cancer, associated procedures and comorbidity is limited. The Admitted Patients Data Collection (APDC) is a census of inpatient hospital discharges in the state of New South Wales (NSW). Our aim was to assess the accuracy of the APDC in identifying upper gastro-intestinal (upper GI) cancer cases, procedures for associated curative resection and comorbidities at the time of admission, compared with data abstracted from medical records (the 'gold standard').
Methods: We reviewed the medical records of 240 patients with an incident upper GI cancer diagnosis derived from a clinical database in one NSW area health service from July 2006 to June 2007. Extracted case record data was matched to APDC discharge data to determine sensitivity, positive predictive value (PPV) and agreement between the two data sources (κ-coefficient).
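As an illustration of the three measures named above, the following minimal sketch computes sensitivity, PPV and Cohen's κ from a 2x2 table that cross-classifies discharge-data coding against the medical record. The function name `agreement_metrics` and the example counts are hypothetical and are not taken from the study; it is one common way to calculate these quantities, not necessarily the authors' exact procedure.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Sensitivity, PPV and Cohen's kappa from a 2x2 table.

    tp: coded in discharge data and confirmed in the medical record
    fp: coded in discharge data but not in the medical record
    fn: in the medical record but not coded in discharge data
    tn: absent from both sources
    Assumes every denominator below is non-zero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # share of record-confirmed cases that were coded
    ppv = tp / (tp + fp)           # share of coded cases confirmed by the record
    observed = (tp + tn) / n       # observed proportion of agreement
    # Agreement expected by chance, from the marginal totals of each source
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {"sensitivity": sensitivity, "ppv": ppv, "kappa": kappa}


# Hypothetical counts, for illustration only (not study data)
metrics = agreement_metrics(tp=40, fp=10, fn=5, tn=45)
print({name: round(value, 3) for name, value in metrics.items()})
```

Sensitivity and PPV treat the medical record as the reference standard, whereas κ treats the two sources symmetrically and corrects the raw agreement for chance.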
Results: The sensitivity of APDC diagnostic codes in identifying site-specific incident cancer ranged from 80% to 95%. This was comparable to the sensitivity of APDC procedure codes in identifying curative resection for upper GI cancer. PPV ranged from 42% to 80% for cancer diagnosis and from 56% to 93% for curative surgery. Agreement between the data sources was κ > 0.72 for most cancer diagnoses and curative resections. However, APDC discharge data was less accurate in reporting common comorbidities: for individual conditions, sensitivity ranged from 9% to 70%, whilst agreement ranged from κ = 0.64 for diabetes down to κ < 0.01 for gastro-oesophageal reflux disorder.
Conclusions: Identification of incident cases of upper GI cancer and curative resection from hospital administrative data is satisfactory, although some cases are missed. Linkage of multiple population-health datasets is advisable to maximise case ascertainment and minimise false-positives. Caution is needed when using hospital discharge data alone to generate comorbidity indices, as disease burden at the time of admission is under-reported.