A multi-source data integration approach reveals novel associations between metabolites and renal outcomes in the German Chronic Kidney Disease study

Sci Rep. 2019 Sep 27;9(1):13954. doi: 10.1038/s41598-019-50346-2.


Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To this end, omics data are integrated with other data types, e.g., clinical, phenotypic, and demographic parameters of categorical or continuous nature. We exemplify this data integration issue for a chronic kidney disease (CKD) study, comprising complex clinical, demographic, and one-dimensional 1H nuclear magnetic resonance metabolic variables. Routine analysis screens for associations of single metabolic features with clinical parameters while accounting for confounders typically chosen by expert knowledge. This knowledge can be incomplete or unavailable. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-of-the-art implementation, and evaluate its performance by sanity checks and predictive performance assessment on independent test data. Particularly, we show that discovered associations remain significant after variable adjustment based on expert knowledge. In contrast, we illustrate that associations discovered in routine univariate screening approaches can be biased by incorrect or incomplete expert knowledge. Our data integration approach reveals important associations between CKD comorbidities and metabolites, including novel associations of the plasma metabolite trimethylamine-N-oxide with cardiac arrhythmia and infarction in CKD stage 3 patients.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers / blood
  • Female
  • Germany
  • Humans
  • Kidney / metabolism*
  • Magnetic Resonance Spectroscopy
  • Male
  • Metabolomics*
  • Models, Theoretical
  • Prognosis
  • Renal Insufficiency, Chronic / metabolism*


  • Biomarkers