Objectives: To investigate three aspects of the redundancy of hierarchical relations across biomedical terminologies: 1) What proportion of the relations is redundant? 2) Which terminologies tend to overlap with other terminologies? 3) Is there a link between redundancy and semantic consistency?
Methods: Hierarchical relations are counted in the various families of terminologies integrated into the Unified Medical Language System (UMLS), and an index of redundancy is computed for each relation. Similarity among sources is computed using the classical cosine method. Semantic consistency is evaluated by reference to the UMLS Semantic Network.
Results: Overall, 29% of the 1,128,261 relations examined exhibit redundancy. The most similar sources include consecutive versions of the same terminology. The link between redundancy and semantic consistency is weak.
Discussion: Applications of these findings are discussed, including selecting sources, selecting useful relations, and auditing the categorization of UMLS concepts.
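Illustration: The cosine method named under Methods can be sketched as follows. This is a minimal example, not the procedure used in the study: it assumes each source is represented as a binary vector over hierarchical relations (a relation is either asserted by the source or not), in which case the cosine reduces to the number of shared relations divided by the geometric mean of the two set sizes. The function name, the concept identifiers, and the toy sources are hypothetical.

```python
import math


def cosine_similarity(relations_a, relations_b):
    """Cosine of two sources viewed as binary vectors over relations.

    Assumption: each source is a set of (parent, child) pairs, so the
    dot product is the number of shared pairs and each vector norm is
    the square root of the set size.
    """
    a, b = set(relations_a), set(relations_b)
    if not a or not b:
        return 0.0  # an empty source has no defined direction
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical toy sources: two consecutive versions of one terminology
# and one unrelated terminology.
source_v1 = {("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C4", "C5")}
source_v2 = {("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C5", "C6")}
source_other = {("C1", "C2"), ("C7", "C8"), ("C8", "C9"), ("C9", "C10")}

print(cosine_similarity(source_v1, source_v2))     # 3 shared of 4 and 4
print(cosine_similarity(source_v1, source_other))  # 1 shared of 4 and 4
```

In this toy data the two versions share three of four relations and score higher than the unrelated pair, which is the pattern the Results describe for consecutive versions.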