Coverage of rare disease names in standard terminologies and implications for patients, providers, and research

AMIA Annu Symp Proc. 2014 Nov 14;2014:564-72. eCollection 2014.


Small numbers of patients are a special challenge for rare disease research. Electronic health record (EHR) data can facilitate research if patients with rare diseases can be reliably identified. We estimate how well standard terminologies cover the names of a set of 6,519 rare diseases. Using the UMLS, 697 (11%) diseases were matched to ICD-9-CM, 1,386 (21%) to ICD-10-CM, and 2,848 (44%) to SNOMED CT. Using published mappings from SNOMED CT to ICD, we further estimate additional broader matches of 2,569 (39%) rare diseases to ICD-9-CM and 1,635 (25%) to ICD-10-CM. The number of codes that match one and only one disease is 1,081 (62%) for ICD-9-CM, 1,403 (73%) for ICD-10-CM, and 3,311 (85%) for SNOMED CT. Our findings confirm that, of the three terminologies, SNOMED CT has the greatest coverage and specificity for identifying patients with a rare disease from EHR data, and can thereby facilitate research and evidence-based care.
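The coverage percentages above can be reproduced from the reported counts. The following is a minimal sketch, assuming each percentage is the matched count divided by the 6,519 rare diseases and rounded to the nearest whole percent. The variable and function names are illustrative and are not taken from the paper. The specificity percentages (62%, 73%, 85%) are not recomputed because the abstract does not give their denominators.

```python
# Counts as reported in the abstract.
TOTAL_RARE_DISEASES = 6519

# Rare diseases matched to each terminology through the UMLS.
exact_matches = {"ICD-9-CM": 697, "ICD-10-CM": 1386, "SNOMED CT": 2848}

# Additional broader matches obtained through published SNOMED CT to ICD mappings.
broader_matches = {"ICD-9-CM": 2569, "ICD-10-CM": 1635}


def coverage_percent(matched, total=TOTAL_RARE_DISEASES):
    """Return the share of rare diseases matched, as a whole-number percentage.

    Assumption: the abstract rounds to the nearest integer percent.
    """
    return round(100 * matched / total)


exact_coverage = {name: coverage_percent(n) for name, n in exact_matches.items()}
broader_coverage = {name: coverage_percent(n) for name, n in broader_matches.items()}

for name, pct in exact_coverage.items():
    print(f"{name}: {pct}% of rare diseases matched via the UMLS")
for name, pct in broader_coverage.items():
    print(f"{name}: {pct}% additional broader matches via SNOMED CT mappings")
```

Under this assumption, the computed values agree with the percentages stated in the abstract for all five reported counts.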

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Biomedical Research
  • Electronic Health Records
  • Humans
  • International Classification of Diseases*
  • Names
  • Rare Diseases* / classification
  • Systematized Nomenclature of Medicine*
  • Unified Medical Language System