Large-scale, Exhaustive Lattice-based Structural Auditing of SNOMED CT

AMIA Annu Symp Proc. 2010 Nov 13:2010:922-6.

Abstract

One criterion for the well-formedness of ontologies is that their hierarchical structure forms a lattice. Formal Concept Analysis (FCA) has been used to assess the quality of ontologies, but it does not scale to large ontologies such as SNOMED CT (> 300k concepts). We developed a methodology called Lattice-based Structural Auditing (LaSA) for auditing biomedical ontologies, implemented through automated SPARQL queries, to exhaustively identify all non-lattice pairs in SNOMED CT. The percentage of non-lattice pairs ranges from 0% to 1.66% across the 19 SNOMED CT hierarchies. Preliminary manual inspection of a limited portion of the more than 544k non-lattice pairs, found among more than 356 million candidate pairs, revealed inconsistent use of precoordination in SNOMED CT, but also a number of false positives. Our results are consistent with those based on FCA, with the advantage that the LaSA pipeline is scalable and applicable to ontological systems consisting mostly of taxonomic links.
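The abstract does not define a non-lattice pair, so the following is only an illustrative sketch under a stated assumption: a pair of concepts is flagged when the two concepts are not hierarchically related, share at least one descendant, and have more than one maximal shared descendant (that is, no unique greatest lower bound). The paper's pipeline runs SPARQL queries over SNOMED CT; this in-memory Python version, with the hypothetical function names `descendants` and `non_lattice_pairs` and a toy hierarchy, is not the authors' implementation.

```python
from itertools import combinations


def descendants(children, node):
    """Return all strict descendants of `node` in an is-a DAG.

    `children` maps each concept to an iterable of its direct children.
    """
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children.get(cur, ()))
    return seen


def non_lattice_pairs(children):
    """List (a, b, maximal_shared_descendants) for pairs lacking a unique meet.

    Assumption (not from the abstract): pairs with no shared descendant
    are not flagged, and hierarchically related pairs are skipped.
    """
    nodes = set(children) | {c for cs in children.values() for c in cs}
    desc = {n: descendants(children, n) for n in nodes}
    flagged = []
    for a, b in combinations(sorted(nodes), 2):
        # Comparable pairs trivially have a greatest lower bound.
        if a in desc[b] or b in desc[a]:
            continue
        common = desc[a] & desc[b]
        if not common:
            continue
        # Keep shared descendants that lie below no other shared descendant.
        maximal = {c for c in common if not any(c in desc[d] for d in common)}
        if len(maximal) > 1:
            flagged.append((a, b, sorted(maximal)))
    return flagged


# Toy hierarchy (hypothetical): A and B both subsume C and D, which are unrelated.
toy = {
    "root": ["A", "B"],
    "A": ["C", "D"],
    "B": ["C", "D"],
    "C": ["E"],
    "D": ["E"],
}
print(non_lattice_pairs(toy))
```

In the toy hierarchy only the pair (A, B) is flagged, because C and D are both maximal shared descendants. The pair (C, D) is not flagged, since E is their single shared descendant. Checking every pair this way is quadratic in the number of concepts, which gives a sense of why the abstract reports hundreds of millions of candidate pairs.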

MeSH terms

  • Biological Ontologies*
  • Systematized Nomenclature of Medicine*