Purpose: Before using computerized databases to study hepatitis C virus (HCV) epidemiology, the validity of the diagnosis must be assessed. We determined the accuracy of HCV diagnostic codes within The Health Improvement Network (THIN), an electronic database containing medical record data from general medical practices in the United Kingdom.
Methods: Patients with initial diagnostic codes for HCV infection or nonspecific viral hepatitis between 2000 and 2007 in the THIN database were identified. Questionnaires were mailed to general practitioners caring for a random sample of 150 of these patients (75 with an HCV code; 75 with a nonspecific viral hepatitis code) to collect information on HCV and other hepatitis diagnoses. We determined the positive predictive value of the database's HCV diagnostic codes and its ability to identify the date of a new HCV diagnosis.
Results: Usable surveys were returned for 146 (97%) patients. Among 74 patients with an HCV code and questionnaire data, HCV was confirmed in 64 (positive predictive value, 86%; 95% CI, 77-93%). In 40 of these 64 confirmed cases (63%), the first recorded diagnosis in THIN was within 30 days of the date reported in the questionnaire (median difference, 11 days; interquartile range, 0-362 days). Among 72 patients with a nonspecific viral hepatitis code, 16 (22%) had HCV, of whom manual review of the database's electronic records correctly identified 12 (75%).
Conclusions: In THIN, the HCV-specific diagnostic codes are highly predictive of HCV infection. After manual review, few patients with a nonspecific viral hepatitis code were misclassified as having HCV infection.
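
The positive predictive value in the Results is the number of confirmed HCV cases divided by the number of patients with an HCV code and questionnaire data (64/74). The abstract does not say how the 95% CI was computed, so the sketch below assumes an exact (Clopper-Pearson) binomial interval as one plausible choice. The function names binom_cdf, solve_unit_interval, and exact_ci are hypothetical helpers introduced only for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_unit_interval(f, target, iterations=100):
    """Bisection on [0, 1] for a monotone function f with f(p) == target."""
    lo, hi = 0.0, 1.0
    increasing = f(1.0) > f(0.0)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        # Move toward the side where f crosses the target.
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed method)."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_unit_interval(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve_unit_interval(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper


# Counts taken from the Results: 64 confirmed among 74 with an HCV code.
confirmed, coded = 64, 74
ppv = confirmed / coded
ci_low, ci_high = exact_ci(confirmed, coded)
print(f"PPV = {ppv:.1%}, 95% CI = {ci_low:.1%} to {ci_high:.1%}")
```

The same helper can be applied to the other proportions in the Results (for example 40/64 or 12/16). A different interval method, such as Wilson, would give slightly different limits.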