A validation study of a new classification algorithm to identify rheumatoid arthritis using administrative health databases: case-control and cohort diagnostic accuracy studies. Results from the RECord linkage On Rheumatic Diseases study of the Italian Society for Rheumatology

BMJ Open. 2015 Jan 28;5(1):e006029. doi: 10.1136/bmjopen-2014-006029.

Abstract

Objectives: To develop and validate a new algorithm to identify patients with rheumatoid arthritis (RA) and estimate disease prevalence using administrative health databases (AHDs) of the Italian Lombardy region.

Design: Case-control and cohort diagnostic accuracy study.

Methods: In a randomly selected sample of 827 patients drawn from a tertiary rheumatology centre (training set), clinically validated diagnoses were linked to administrative data including diagnostic codes and drug prescriptions. A stepwise algorithm, with steps of decreasing specificity, was developed and its accuracy was assessed by calculating sensitivity, specificity, positive predictive value (PPV) and negative predictive value, with corresponding CIs. The algorithm was applied to two validation sets: 106 patients from a secondary rheumatology centre and 6087 participants from primary care. Alternative algorithms were developed to increase PPV at the population level. Crude prevalence estimates, and estimates adjusted for algorithm misclassification rates, were obtained for the Lombardy region.
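The accuracy measures named in the Methods can be illustrated with a short sketch. It computes sensitivity, specificity, PPV and negative predictive value from a 2x2 table of algorithm result against clinical diagnosis. The abstract does not state which CI method was used, so the Wilson score interval here is an assumption, and the function names and the counts in the usage note are hypothetical rather than taken from the study.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, (ci_low, ci_high))} from 2x2 counts.

    tp/fn: clinically diagnosed cases flagged / missed by the algorithm.
    fp/tn: non-cases flagged / not flagged by the algorithm.
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, (est, (lo, hi)) in accuracy_metrics(90, 10, 10, 90).items():
        print(f"{name}: {est:.3f} ({lo:.3f} to {hi:.3f})")
```

Because PPV and negative predictive value depend on how common the disease is in the sample, values computed in a case-enriched clinic sample would not be expected to carry over unchanged to the general population.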

Results: The algorithms included the following criteria: RA certification by a rheumatologist, certification for other autoimmune diseases by specialists, RA code in the hospital discharge form, and prescription of disease-modifying antirheumatic drugs and oral glucocorticoids. In the training set, a four-step algorithm identified clinically diagnosed RA cases with a sensitivity of 96.3% (95% CI 93.6% to 98.2%) and a specificity of 90.3% (95% CI 87.4% to 92.7%). Both external validations showed highly consistent results. More specific algorithms achieved >80% PPV at the population level. The crude RA prevalence in Lombardy was 0.52%, and estimates adjusted for misclassification ranged from 0.31% (95% CI 0.14% to 0.42%) to 0.37% (95% CI 0.25% to 0.47%).
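To show why an adjusted prevalence can be lower than the crude one, the sketch below applies two standard corrections for misclassification. The abstract does not describe the adjustment actually used in the study, so both formulas are illustrative assumptions: a PPV-based correction (true cases = flagged cases x PPV / sensitivity) and the Rogan-Gladen estimator based on sensitivity and specificity. Function names and all input values are hypothetical.

```python
def _clip01(x):
    """Keep an estimated proportion inside [0, 1]."""
    return max(0.0, min(1.0, x))


def adjust_by_ppv(crude_prev, ppv, sensitivity):
    """PPV-based correction: flagged * PPV gives true positives,
    dividing by sensitivity adds back the cases the algorithm missed."""
    if not 0 < sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    return _clip01(crude_prev * ppv / sensitivity)


def rogan_gladen(crude_prev, sensitivity, specificity):
    """Rogan-Gladen estimator: (crude + spec - 1) / (sens + spec - 1)."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return _clip01((crude_prev + specificity - 1) / denom)


if __name__ == "__main__":
    # Hypothetical inputs, not values reported in the study.
    print(adjust_by_ppv(crude_prev=0.005, ppv=0.70, sensitivity=0.95))
    print(rogan_gladen(crude_prev=0.20, sensitivity=0.90, specificity=0.90))
```

For a rare disease the Rogan-Gladen estimate is very sensitive to specificity: when the false-positive rate (1 - specificity) exceeds the crude prevalence, the raw estimate is negative and is clipped to zero here, which is one reason a PPV-based correction may be preferred at the population level.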

Conclusions: AHDs are valuable tools for identifying RA cases at the population level; they allow estimation of disease prevalence and selection of retrospective cohorts.

Keywords: EPIDEMIOLOGY; PUBLIC HEALTH.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Arthritis, Rheumatoid / diagnosis*
  • Arthritis, Rheumatoid / epidemiology
  • Case-Control Studies
  • Cohort Studies
  • Databases, Factual / statistics & numerical data*
  • Female
  • Humans
  • Italy / epidemiology
  • Male
  • Medical Record Linkage
  • Medical Records Systems, Computerized / statistics & numerical data*
  • Middle Aged
  • Prevalence
  • Rheumatology / statistics & numerical data
  • Sensitivity and Specificity