Metabomatching: Using genetic association to identify metabolites in proton NMR spectroscopy

PLoS Comput Biol. 2017 Dec 1;13(12):e1005839. doi: 10.1371/journal.pcbi.1005839. eCollection 2017 Dec.

Abstract

A metabolome-wide genome-wide association study (mGWAS) aims to discover the effects of genetic variants on metabolome phenotypes. Most mGWASes use as phenotypes concentrations of limited sets of metabolites that can be identified and quantified from spectral information. In contrast, in an untargeted mGWAS both identification and quantification are forgone and, instead, all measured metabolome features are tested for association with genetic variants. While the untargeted approach does not discard data that may have eluded identification, the interpretation of associated features remains a challenge. To address this issue, we developed metabomatching to identify the metabolites underlying significant associations observed in untargeted mGWASes on proton NMR metabolome data. Metabomatching capitalizes on genetic spiking, the concept that because metabolome features associated with a genetic variant tend to correspond to the peaks of the NMR spectrum of the underlying metabolite, genetic association can allow for identification. Applied to the untargeted mGWASes in the SHIP and CoLaus cohorts and using 180 reference NMR spectra of the urine metabolome database, metabomatching successfully identified the underlying metabolite in 14 of 19 associations in SHIP and 8 of 9 associations in CoLaus. The accuracy and efficiency of our method make it a strong contender for facilitating or complementing metabolomics analyses in large cohorts, where the availability of genetic or other data enables our approach but targeted quantification is limited.
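
The idea of genetic spiking can be illustrated with a minimal sketch. It is not the scoring method of the paper: it assumes, purely for illustration, that each candidate metabolite is scored by summing the squared association z-scores of the spectral features that lie within a small chemical-shift tolerance of its reference peaks. The function names, the tolerance value, the metabolite labels, and all numbers are hypothetical toy values, not data from SHIP, CoLaus, or the urine metabolome database.

```python
def match_score(shifts, z_scores, peaks, tol=0.025):
    """Sum squared z-scores of features within `tol` ppm of any reference peak.

    Illustrative scoring rule only (an assumption, not the published method).
    """
    return sum(
        z * z
        for shift, z in zip(shifts, z_scores)
        if any(abs(shift - peak) <= tol for peak in peaks)
    )


def rank_metabolites(shifts, z_scores, references, tol=0.025):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {
        name: match_score(shifts, z_scores, peaks, tol)
        for name, peaks in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy association profile for one genetic variant: one z-score per
# chemical-shift position (ppm). All values are made up.
shifts = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
z_scores = [0.5, 6.0, 0.2, 5.0, 0.1, 0.3]

# Hypothetical reference peak positions (ppm) for three placeholder metabolites.
references = {
    "metabolite_A": [1.5, 2.5],
    "metabolite_B": [1.0, 3.0],
    "metabolite_C": [2.0, 3.5],
}

ranking = rank_metabolites(shifts, z_scores, references)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy profile the strongly associated features sit at the reference peaks of the placeholder metabolite_A, so it ranks first. A real analysis would also need to handle multiplets, overlapping peaks, and significance assessment, none of which this sketch attempts.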

MeSH terms

  • Databases, Genetic*
  • Genome-Wide Association Study / methods*
  • Humans
  • Magnetic Resonance Spectroscopy / methods*
  • Metabolomics / methods*

Grants and funding

This work was supported by the Leenaards Foundation (to ZK), the European Commission’s Horizon 2020 program via the PhenoMeNal project (654241 to SB), the Swiss Institute of Bioinformatics (to SB, to ZK), the Swiss National Science Foundation (31003A-143914 to ZK, 310030-152724 to SB) and SystemsX.ch (51RTP0-151019 to ZK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.