Bioinformatics. 2018 Mar 15;34(6):1009-1015.
doi: 10.1093/bioinformatics/btx682.

Unsupervised multiple kernel learning for heterogeneous data integration

Jérôme Mariette et al. Bioinformatics.

Abstract

Motivation: Recent advances in high-throughput sequencing have expanded the breadth of available omics datasets, and the integrated analysis of multiple datasets obtained on the same samples has yielded important insights in a wide range of applications. However, integrating various sources of information remains a challenge for systems biology because the datasets produced are often of heterogeneous types, which calls for generic methods that take their different specificities into account.

Results: We propose a multiple kernel framework that integrates multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets collected during the TARA Oceans expedition were explored to demonstrate that our method is able to retrieve previous findings in a single kernel PCA and to provide a new picture of the sample structure when a larger number of datasets is included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA with respect to the original data. Second, the multi-omics breast cancer datasets provided by The Cancer Genome Atlas are analysed using a kernel Self-Organizing Map with both single-omics and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method in improving the representation of the studied biological system.
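
To make the idea of a meta-kernel concrete, here is a minimal sketch in Python with NumPy. It is not the mixKernel API and not the weight-learning procedure of the paper. It assumes linear kernels, cosine normalization, and uniform weights as a naive placeholder for learned weights; all function names are hypothetical. It only illustrates how several per-dataset kernels measured on the same samples can be merged into one kernel and explored with a single kernel PCA.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of pairwise dot products between samples (rows of X)."""
    return X @ X.T

def cosine_normalize(K):
    """Rescale a kernel so that every sample has unit self-similarity.
    Assumes a strictly positive diagonal."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def combine_kernels(kernels, weights=None):
    """Convex combination of kernels computed on the same samples.
    Uniform weights are an illustrative assumption, not learned weights."""
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_pca(K, n_components=2):
    """Kernel PCA: eigendecomposition of the doubly centered kernel."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = H @ K @ H
    eigvals, eigvecs = np.linalg.eigh(Kc)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None) # guard against tiny negatives
    coords = eigvecs[:, order] * np.sqrt(eigvals)
    return coords, eigvals

# Two synthetic "omics" tables measured on the same 6 samples.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(6, 4))
X2 = rng.normal(size=(6, 10))

kernels = [cosine_normalize(linear_kernel(X)) for X in (X1, X2)]
K_meta = combine_kernels(kernels)
coords, eigvals = kernel_pca(K_meta, n_components=2)
```

Because each dataset enters only through its sample-by-sample kernel, tables with different numbers or types of variables can be combined as long as the rows refer to the same samples. Replacing the uniform weights with learned ones changes only the `weights` argument.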

Availability and implementation: The proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package, and a tutorial describing the approach can be found on the mixOmics website: http://mixomics.org/mixkernel/.

Contact: jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr.

Supplementary information: Supplementary data are available at Bioinformatics online.
