Data Integration through Ontology-Based Data Access to Support Integrative Data Analysis: A Case Study of Cancer Survival

Proceedings (IEEE Int Conf Bioinformatics Biomed). 2017 Nov:2017:1300-1303. doi: 10.1109/BIBM.2017.8217849. Epub 2017 Dec 18.

Abstract

To improve cancer survival rates and prognosis, one of the first steps is to better understand the factors that contribute to cancer survival. Prior research has suggested that cancer survival is influenced by multiple factors at multiple levels. Yet most existing analyses of cancer survival have used data from a single source, and integrating variables from different sources poses key challenges. Data integration is a daunting task because data from different sources can be heterogeneous in syntax, schema, and particularly semantics. We therefore propose a semantic data integration approach that generates a universal conceptual representation of "information", including data and their relationships. This paper describes a case study of semantic data integration that links three data sets covering both individual-level and contextual-level factors, in order to assess the association between the predictors of interest and cancer survival using Cox proportional hazards models.

Keywords: cancer survival; integrative data analysis; ontology-based data access; semantic data integration.