What evidence is there for a delay in diagnostic coding of RA in UK general practice records? An observational study of free text

BMJ Open. 2016 Jun 28;6(6):e010393. doi: 10.1136/bmjopen-2015-010393.

Abstract

Objectives: Much research with electronic health records (EHRs) uses coded or structured data only; important information captured in the free text remains unused. One dimension of EHR data quality assessment is 'currency' or timeliness, that is, whether the data are representative of the patient state at the time of measurement. We explored the use of free text in UK general practice patient records to evaluate delays in recording of rheumatoid arthritis (RA) diagnosis. We also aimed to locate and quantify disease and diagnostic information recorded only in text.

Setting: UK general practice patient records from the Clinical Practice Research Datalink.

Participants: 294 individuals with incident diagnosis of RA between 2005 and 2008; 204 women and 85 men, median age 63 years.

Primary and secondary outcome measures: Assessment of (1) quantity and timing of text entries for disease-modifying antirheumatic drugs (DMARDs) as a proxy for the RA disease code, and (2) quantity, location and timing of free text information relating to RA onset and diagnosis.

Results: Inflammatory markers, pain and DMARDs were the most common categories of disease information in text prior to RA diagnostic code; 10-37% of patients had such information only in text. Read codes associated with RA-related text included correspondence, general consultation and arthritis codes. 64 patients (22%) had DMARD text entries >14 days prior to RA code; these patients had more and earlier referrals to rheumatology, tests, swelling, pain and DMARD prescriptions, suggestive of an earlier implicit diagnosis than was recorded by the diagnostic code.
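
The >14-day criterion above can be illustrated with a minimal sketch. It assumes that each patient has already been reduced to two dates, the earliest DMARD mention in free text and the first RA diagnostic code. The patient identifiers, dates and the function name has_delayed_code are hypothetical and are not taken from the study; the authors' actual extraction method is not described here.

```python
from datetime import date, timedelta

# Hypothetical records: patient id -> (first DMARD text entry, first RA code).
# These dates are invented for illustration; they are not study data.
records = {
    "p1": (date(2006, 1, 3), date(2006, 3, 1)),    # text 57 days before code
    "p2": (date(2007, 5, 10), date(2007, 5, 20)),  # text 10 days before code
    "p3": (None, date(2005, 9, 14)),               # no DMARD text entry
    "p4": (date(2008, 2, 1), date(2008, 1, 15)),   # text after code
}

# Threshold from the abstract: DMARD text more than 14 days before the RA code.
DELAY_THRESHOLD = timedelta(days=14)


def has_delayed_code(first_dmard_text, first_ra_code, threshold=DELAY_THRESHOLD):
    """Return True if the DMARD text entry precedes the RA code by more than threshold."""
    if first_dmard_text is None or first_ra_code is None:
        return False
    return first_ra_code - first_dmard_text > threshold


delayed = sorted(pid for pid, (text, code) in records.items()
                 if has_delayed_code(text, code))
proportion = len(delayed) / len(records)
print(delayed, proportion)
```

Applied to the figures reported above, the same proportion calculation gives 64 / 294, or about 22%.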

Conclusions: RA-related symptoms, tests, referrals and prescriptions were recorded in free text, with 22% of patients showing strong evidence of delay in coding of diagnosis. Researchers using EHRs may need to compensate for delayed codes by incorporating text into their case-ascertainment strategies. Natural language processing techniques have the capability to do this at scale.

Keywords: Rheumatoid arthritis; data quality; electronic health records; free text; general practice.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Antirheumatic Agents / therapeutic use
  • Arthritis, Rheumatoid / diagnosis*
  • Arthritis, Rheumatoid / drug therapy
  • Clinical Coding*
  • Delayed Diagnosis*
  • Electronic Health Records
  • Female
  • General Practice
  • Humans
  • Male
  • Middle Aged
  • Time-to-Treatment / statistics & numerical data*
  • United Kingdom

Substances

  • Antirheumatic Agents