Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt
- PMID: 19617889
- PMCID: PMC3159387
- DOI: 10.1038/nprot.2009.97
Abstract
Genomic experiments produce multiple views of biological systems, among them DNA sequence and copy number variation, and mRNA and protein abundance. Understanding these systems requires integrated bioinformatic analysis. Public databases such as Ensembl provide relationships and mappings between the relevant sets of probe and target molecules. However, these relationships can be biologically complex, and the content of the databases is dynamic. We demonstrate how to use the computational environment R to integrate and jointly analyze experimental datasets, employing BioMart web services to provide the molecule mappings. We also discuss typical problems encountered in making gene-to-transcript-to-protein mappings. The approach provides a flexible, programmable and reproducible basis for state-of-the-art bioinformatic data integration.
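The gene-to-transcript-to-protein mapping problem mentioned above can be illustrated with a small offline sketch. It is not the authors' protocol, which uses R and biomaRt against live BioMart services; it is a minimal Python example under stated assumptions. The identifiers (`geneA`, `txA1`, `protA1`, and so on), the `ROWS` table and the helper names `proteins_per_gene` and `classify` are all hypothetical and stand in for a mapping table retrieved from a database such as Ensembl.

```python
# Hypothetical gene -> transcript -> protein rows, shaped like a mapping table.
# All identifiers are made up; None marks a transcript with no protein product.
ROWS = [
    ("geneA", "txA1", "protA1"),
    ("geneA", "txA2", "protA2"),
    ("geneB", "txB1", "protB1"),
    ("geneC", "txC1", None),
]


def proteins_per_gene(rows):
    """Group protein identifiers by gene, keeping genes that have no protein."""
    mapping = {}
    for gene, _transcript, protein in rows:
        # setdefault registers the gene even when the protein is missing
        proteins = mapping.setdefault(gene, set())
        if protein is not None:
            proteins.add(protein)
    return mapping


def classify(mapping):
    """Label each gene by the number of distinct proteins it maps to."""
    labels = {}
    for gene, proteins in mapping.items():
        if not proteins:
            labels[gene] = "unmapped"
        elif len(proteins) == 1:
            labels[gene] = "one-to-one"
        else:
            labels[gene] = "one-to-many"
    return labels


if __name__ == "__main__":
    for gene, label in sorted(classify(proteins_per_gene(ROWS)).items()):
        print(gene, label)
```

Classifying mappings before joining datasets makes the ambiguous cases explicit, so that one-to-many and unmapped identifiers can be handled deliberately rather than silently duplicated or dropped.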
Conflict of interest statement
The authors declare that they have no competing financial interests.
Similar articles
- BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics. 2005 Aug 15;21(16):3439-40. doi: 10.1093/bioinformatics/bti525. PMID: 16082012
- GATExplorer: genomic and transcriptomic explorer; mapping expression probes to gene loci, transcripts, exons and ncRNAs. BMC Bioinformatics. 2010 Apr 29;11:221. doi: 10.1186/1471-2105-11-221. PMID: 20429936 Free PMC article.
- pubmed2ensembl: a resource for mining the biological literature on genes. PLoS One. 2011;6(9):e24716. doi: 10.1371/journal.pone.0024716. Epub 2011 Sep 29. PMID: 21980353 Free PMC article.
- GenomeGraphs: integrated genomic data visualization with R. BMC Bioinformatics. 2009 Jan 6;10:2. doi: 10.1186/1471-2105-10-2. PMID: 19123956 Free PMC article.
- Integrating multiple 'omics' analysis for microbial biology: application and methodologies. Microbiology (Reading). 2010 Feb;156(Pt 2):287-301. doi: 10.1099/mic.0.034793-0. Epub 2009 Nov 12. PMID: 19910409 Review.
Cited by
- Evolution and Expression of the Immune System of a Facultatively Anadromous Salmonid. Front Immunol. 2021 Feb 26;12:568729. doi: 10.3389/fimmu.2021.568729. eCollection 2021. PMID: 33717060 Free PMC article.
- Epigenome-wide association study for transgenerational disease sperm epimutation biomarkers following ancestral exposure to jet fuel hydrocarbons. Reprod Toxicol. 2020 Dec;98:61-74. doi: 10.1016/j.reprotox.2020.08.010. Epub 2020 Sep 6. PMID: 32905848 Free PMC article.
- Transcriptomic Signatures of Ageing Vary in Solitary and Social Forms of an Orchid Bee. Genome Biol Evol. 2021 Jun 8;13(6):evab075. doi: 10.1093/gbe/evab075. PMID: 33914875 Free PMC article.
- TCGAnalyzeR: An Online Pan-Cancer Tool for Integrative Visualization of Molecular and Clinical Data of Cancer Patients for Cohort and Associated Gene Discovery. Cancers (Basel). 2024 Jan 13;16(2):345. doi: 10.3390/cancers16020345. PMID: 38254834 Free PMC article.
- Localized skin inflammation during cutaneous leishmaniasis drives a chronic, systemic IFN-γ signature. PLoS Negl Trop Dis. 2021 Apr 1;15(4):e0009321. doi: 10.1371/journal.pntd.0009321. eCollection 2021 Apr. PMID: 33793565 Free PMC article.
