Fingerprinting Biomedical Terminologies--Automatic Classification and Visualization of Biomedical Vocabularies through UMLS Semantic Group Profiles

Stud Health Technol Inform. 2015;216:771-5.


Objectives: To explore automatic methods for the classification of biomedical vocabularies based on their content.

Methods: We create a semantic group profile for each source vocabulary in the UMLS and compare the resulting profile vectors using Euclidean distance. We explore several techniques for visualizing individual semantic group profiles and the entire distance matrix, including donut pie charts, heatmaps, dendrograms and networks.
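The profile-and-distance step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a profile is the proportion of a vocabulary's concepts falling in each of the 15 UMLS semantic groups (the abstract does not say how profiles are normalized), and the vocabulary names and concept labels are invented toy data.

```python
from collections import Counter
from math import sqrt

# Abbreviations of the 15 UMLS semantic groups.
SEMANTIC_GROUPS = [
    "ACTI", "ANAT", "CHEM", "CONC", "DEVI", "DISO", "GENE", "GEOG",
    "LIVB", "OBJC", "OCCU", "ORGA", "PHEN", "PHYS", "PROC",
]


def profile(group_labels):
    """Return the share of concepts in each semantic group (assumed normalization)."""
    if not group_labels:
        raise ValueError("a vocabulary needs at least one concept")
    counts = Counter(group_labels)
    total = len(group_labels)
    return [counts[g] / total for g in SEMANTIC_GROUPS]


def euclidean(p, q):
    """Euclidean distance between two profile vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(profiles):
    """Pairwise distances between all vocabularies, in sorted name order."""
    names = sorted(profiles)
    matrix = [[euclidean(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical toy vocabularies: one semantic group label per concept.
toy_vocabularies = {
    "VOCAB_A": ["DISO"] * 8 + ["PROC"] * 2,
    "VOCAB_B": ["DISO"] * 7 + ["PROC"] * 3,
    "VOCAB_C": ["CHEM"] * 9 + ["DISO"] * 1,
}

profiles = {name: profile(labels) for name, labels in toy_vocabularies.items()}
names, matrix = distance_matrix(profiles)

for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

In this toy data the two disorder-heavy vocabularies end up much closer to each other than either is to the chemical-heavy one. A distance matrix of this kind is the sort of input a heatmap, dendrogram or network layout would take.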

Results: We provide donut pie charts for individual source vocabularies, as well as a heatmap, dendrogram and network for a subset of 78 vocabularies from the UMLS.

Conclusions: Our approach to fingerprinting biomedical terminologies is completely automated and can easily be applied to all source vocabularies in the UMLS, including upcoming versions of the UMLS. It supports the exploration, selection and comparison of the biomedical terminologies integrated into the UMLS. The visualizations are available at (

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Ontologies*
  • Machine Learning*
  • Natural Language Processing*
  • Pattern Recognition, Automated / methods
  • Semantics*
  • Unified Medical Language System / classification*
  • User-Computer Interface*