A scale-free network view of the UMLS to learn terminology translations

Stud Health Technol Inform. 2007;129(Pt 1):689-93.

Abstract

The UMLS Metathesaurus belongs to the class of scale-free networks, in which a few concept hubs possess a large number of relationships. The hubs provide useful links between concepts from disparate terminologies in the UMLS; however, they also exponentially increase the number of possible transitive cross-terminology paths. Toward the goal of using machine learning to rank cross-terminology translations, we propose a traversal algorithm that exploits the scale-free property of the UMLS to reduce the number of candidate translations. We classify the concept hubs as "informational" or "noisy" and provide an automated method to detect them. Using gold standard mappings from SNOMED-CT to ICD9CM, we found an average 20-fold reduction in the number of candidate mappings while achieving comparable recall and ranking results. A hub-driven traversal strategy provides a promising approach to generating high-quality cross-terminology translations from the UMLS.
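The abstract does not give the details of the traversal algorithm or of the hub detection method, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that hubs can be approximated by a simple degree threshold and that paths are not expanded through hubs flagged as noisy. The toy graph, the threshold, the depth limit, and the function names `find_hubs` and `candidate_translations` are all hypothetical and are not taken from the paper.

```python
from collections import deque

def find_hubs(graph, min_degree):
    """Return concepts with at least min_degree neighbours (assumed hub criterion)."""
    return {node for node, nbrs in graph.items() if len(nbrs) >= min_degree}

def candidate_translations(graph, source, targets, blocked, max_depth=3):
    """Breadth-first search from source that collects reachable target concepts.

    Concepts in `blocked` (e.g. hubs judged noisy) can be reached but are
    never expanded, so no path continues through them.
    """
    seen = {source}
    queue = deque([(source, 0)])
    found = set()
    while queue:
        node, depth = queue.popleft()
        if node in targets and node != source:
            found.add(node)
        if depth == max_depth or (node in blocked and node != source):
            continue
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return found

# Hypothetical toy network: S1 is a source-terminology concept,
# T1..T4 are target-terminology concepts, HUB is a highly connected concept.
graph = {
    "S1": {"A", "HUB"},
    "A": {"S1", "T1"},
    "T1": {"A"},
    "HUB": {"S1", "T2", "T3", "T4", "X"},
    "T2": {"HUB"},
    "T3": {"HUB"},
    "T4": {"HUB"},
    "X": {"HUB"},
}
targets = {"T1", "T2", "T3", "T4"}

hubs = find_hubs(graph, min_degree=4)
unrestricted = candidate_translations(graph, "S1", targets, blocked=set())
pruned = candidate_translations(graph, "S1", targets, blocked=hubs)
```

In this toy case the unrestricted search returns every target reachable through the hub, while the pruned search keeps only the target reached along the non-hub path. A real system would also need a way to separate informational hubs, which should stay traversable, from noisy ones; the sketch treats every detected hub as noisy purely for illustration.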

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • International Classification of Diseases
  • Systematized Nomenclature of Medicine
  • Unified Medical Language System*
  • Vocabulary, Controlled