Validation of The Health Improvement Network (THIN) database for epidemiologic studies of chronic kidney disease

Pharmacoepidemiol Drug Saf. 2011 Nov;20(11):1138-49. doi: 10.1002/pds.2203. Epub 2011 Aug 24.


Purpose: Chronic kidney disease (CKD) is a prevalent and important outcome and covariate in pharmacoepidemiology. The Health Improvement Network (THIN) in the UK represents a unique resource for population-based studies of CKD. We compiled a valid list of Read codes to identify subjects with moderate to advanced CKD.

Methods: A cross-sectional validation study was performed to identify codes that best define CKD Stages 3-5. All subjects with at least one non-zero measure of serum creatinine after 1 January 2002 were included. Estimated glomerular filtration rate (eGFR) was calculated according to the Schwartz formula for subjects aged <18 years and the Modification of Diet in Renal Disease formula for subjects aged ≥18 years. CKD was defined as an eGFR <60 mL/minute/1.73 m² on at least two occasions, more than 90 days apart.
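The laboratory definition in the Methods can be sketched in a few lines of Python. This is only an illustration, not the study's actual code: the function name, the input layout (a list of date and eGFR pairs per subject), and the reading of "more than 90 days apart" as "any two low values separated by more than 90 days, regardless of values in between" are all assumptions.

```python
from datetime import date

EGFR_THRESHOLD = 60.0  # mL/minute/1.73 m², threshold stated in the Methods
MIN_GAP_DAYS = 90      # low readings must be MORE than this many days apart


def meets_lab_definition(measurements):
    """Return True if two eGFR values below the threshold are
    separated by more than MIN_GAP_DAYS.

    measurements: iterable of (date, egfr) pairs (hypothetical layout).
    """
    low_dates = [d for d, egfr in measurements if egfr < EGFR_THRESHOLD]
    if len(low_dates) < 2:
        return False
    # The earliest and latest low readings span the widest gap,
    # so checking that single pair is sufficient.
    return (max(low_dates) - min(low_dates)).days > MIN_GAP_DAYS


# Hypothetical subject: two low readings about five months apart.
history = [
    (date(2005, 1, 10), 55.0),
    (date(2005, 3, 1), 72.0),
    (date(2005, 6, 1), 58.0),
]
print(meets_lab_definition(history))
```

Under a stricter reading that required every intervening value to be low as well, the middle reading of 72 would disqualify this subject; the abstract does not say which reading was used.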

Results: The laboratory definition identified 230,426 subjects with CKD, for a period prevalence in 2008 of 4.56% (95% CI, 4.54-4.58). A list of 45 Read codes was compiled, which yielded a positive predictive value of 88.9% (95% CI, 88.7-89.1), sensitivity of 48.8%, negative predictive value of 86.5%, and specificity of 98.2%. Of the 11.1% of subjects with a code who did not meet the laboratory definition, 83.6% had at least one eGFR <60 mL/minute/1.73 m². The most commonly used code was for CKD Stage 3.
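For readers less familiar with the four accuracy measures reported above, the following sketch shows how each is derived from a 2x2 table in which the Read code list is the test and the laboratory definition is the reference standard. The function name is hypothetical and the counts in the example are invented for illustration only; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute accuracy measures from 2x2 counts.

    tp: code present, lab definition met
    fp: code present, lab definition not met
    fn: code absent,  lab definition met
    tn: code absent,  lab definition not met
    Assumes every denominator is non-zero.
    """
    return {
        "ppv": tp / (tp + fp),          # coded subjects who truly have CKD
        "sensitivity": tp / (tp + fn),  # CKD subjects who carry a code
        "npv": tn / (tn + fn),          # uncoded subjects who are truly CKD-free
        "specificity": tn / (tn + fp),  # CKD-free subjects who carry no code
    }


# Invented counts, chosen only to show a high PPV alongside a modest sensitivity.
metrics = diagnostic_metrics(tp=80, fp=20, fn=120, tn=780)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

A pattern like the one reported, with high positive predictive value and specificity but sensitivity below 50%, means that a code is a reliable sign of CKD while the absence of a code does not rule it out.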

Conclusions: The proposed list of codes can be used to accurately identify CKD when serum creatinine data are limited. The most sensitive approach for the detection of CKD is to use this list to supplement creatinine measures.

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Child
  • Child, Preschool
  • Chronic Disease / epidemiology
  • Clinical Coding / statistics & numerical data*
  • Computers
  • Creatinine* / blood
  • Cross-Sectional Studies / statistics & numerical data
  • Databases, Factual*
  • Epidemiologic Studies
  • Female
  • Glomerular Filtration Rate
  • Humans
  • Kidney Failure, Chronic / classification
  • Kidney Failure, Chronic / diagnosis
  • Kidney Failure, Chronic / epidemiology*
  • Male
  • Middle Aged
  • Predictive Value of Tests
  • Renal Insufficiency, Chronic / classification
  • Renal Insufficiency, Chronic / diagnosis
  • Renal Insufficiency, Chronic / epidemiology*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software
  • Young Adult


Substances

  • Creatinine