BMC Bioinformatics. 2005 Dec 21;6:305.
doi: 10.1186/1471-2105-6-305.

The Gene Set Builder: Collation, Curation, and Distribution of Sets of Genes

Free PMC article

Dimas Yusuf et al. BMC Bioinformatics.


Background: In bioinformatics and genomics, many applications are designed to investigate the common properties of a set of genes. Often, these multi-gene analysis tools attempt to reveal ties in sequence, function, and expression. However, while tremendous effort has been invested in developing tools that can analyze a set of genes, minimal effort has been invested in developing tools that help researchers compile, store, and annotate gene sets in the first place. As a result, the process of making or accessing a set often involves tedious and time-consuming steps, such as finding identifiers for each individual gene. These steps are often repeated extensively to shift from one identifier type to another or to recreate a published set. In this paper, we present a simple online tool that, with the help of the gene catalogs Ensembl and GeneLynx, can help researchers build and annotate sets of genes quickly and easily.

Description: The Gene Set Builder is a database-driven, web-based tool designed to help researchers compile, store, export, and share sets of genes. The application supports the 17 eukaryotic genomes found in version 32 of the Ensembl database, which includes species from yeast to human. User-created information, such as sets and customized annotations, is stored to facilitate easy access. Gene sets stored in the system can be "exported" in a variety of output formats: as lists of identifiers, in tables, or as sequences. In addition, gene sets can be "shared" with specific users to facilitate collaboration, or fully released to provide access to published results. The application also features a Perl API (Application Programming Interface) for direct connectivity to custom analysis tools. A downloadable Quick Reference guide and an online tutorial are available to help new users learn its features.
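
To make the export step concrete, the following is a minimal sketch of how a stored gene set might be rendered in the three output styles mentioned above (identifier list, table, and sequences). It is not the Gene Set Builder's code or its Perl API: the class names, field names, and method names are all hypothetical, Python is used only for illustration, and the identifiers and sequences are placeholders rather than real records.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """One gene record; all field names here are hypothetical."""
    ensembl_id: str
    symbol: str
    sequence: str = ""
    annotation: str = ""


@dataclass
class GeneSet:
    """A named, ordered collection of genes (illustrative only)."""
    name: str
    species: str
    genes: list = field(default_factory=list)

    def add(self, gene):
        # Skip duplicates so each identifier appears once in the set.
        if all(g.ensembl_id != gene.ensembl_id for g in self.genes):
            self.genes.append(gene)

    def to_id_list(self):
        """Export as a plain list of identifiers, one per line."""
        return "\n".join(g.ensembl_id for g in self.genes)

    def to_table(self):
        """Export as a tab-separated table with a header row."""
        rows = ["ensembl_id\tsymbol\tannotation"]
        rows += [f"{g.ensembl_id}\t{g.symbol}\t{g.annotation}" for g in self.genes]
        return "\n".join(rows)

    def to_fasta(self):
        """Export sequences in FASTA format, skipping genes without one."""
        return "\n".join(
            f">{g.ensembl_id} {g.symbol}\n{g.sequence}"
            for g in self.genes if g.sequence
        )


# Placeholder identifiers and sequences, not real records.
demo = GeneSet(name="demo", species="Homo sapiens")
demo.add(Gene("GENE0001", "ABC1", sequence="ATGGCC", annotation="example"))
demo.add(Gene("GENE0002", "XYZ2"))
demo.add(Gene("GENE0001", "ABC1"))  # duplicate identifier, ignored
```

Keeping the set as one ordered collection and deriving every export from it means a set is compiled once and can then be re-rendered in whichever format a downstream analysis tool expects.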

Conclusion: The Gene Set Builder is an Ensembl-facilitated online tool designed to help researchers compile and manage sets of genes in a user-friendly environment. The application can be accessed via


Figure 1
A screen capture of Gene Set Builder. This "special edition" user interface utilizes a Flash-based navigation system, complete with animation and tool tips.



