In the biomedical domain, a number of common data models (CDMs) have experienced wide uptake. However, none of these has emerged as the single common model. Recently, the demand for integrating and analyzing increasingly large data sets in clinical and translational research has led to numerous efforts to harmonize existing CDMs and to integrate data curated according to those models. These efforts raise the question of how to represent the semantics of data appropriately, and they highlight the fact that different groups often have greatly different definitions of 'semantics'. The question of how to formally assure that mappings between CDMs are correct is often overlooked. The answer to these challenges lies in using axiomatically rich ontologies, which make it possible to verify by automatic inference that terms refer to the same set of entities. This verification is only possible if the ontologies represent the content of the scientific disciplines in accordance with the reality of the domains those disciplines study. Organizing and managing the development of numerous orthogonal domain-specific ontologies would benefit from an Architecture Reference Model that helps keep the relationships within each domain consistent and ensures that appropriate inter-domain relationships are defined. This paper explores how a strong logical representation of the scientific domain not only fosters harmonization of CDMs, but also informs and facilitates the transition from data through information to knowledge.
Keywords: Common Data Models; Ontologies; Semantic Interoperability; Semantics.