Using Galaxy to perform large-scale interactive data analyses

James Taylor; Ian Schenck; Dan Blankenberg; Anton Nekrutenko

doi:10.1002/0471250953.bi1005s19

Curr Protoc Bioinformatics. 2007 Sep:Chapter 10:Unit 10.5. doi: 10.1002/0471250953.bi1005s19.

Authors

James Taylor¹, Ian Schenck, Dan Blankenberg, Anton Nekrutenko

Affiliation

¹ New York University, New York, New York, USA.

Abstract

While most experimental biologists know where to download genomic data, few have a concrete plan for analyzing it. This situation can be corrected by (1) providing unified portals that serve genomic data and (2) building Web applications that allow flexible retrieval and on-the-fly analysis of the data. Powerful resources, such as the UCSC Genome Browser, already address the first issue. The second issue, however, remains open. For example, how would one find the human protein-coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract the orthologous sequences from all sequenced mammals? All of the relevant data can be accessed through the UCSC Genome Browser, but once the data is downloaded, how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as the UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways.
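
To make the exon example concrete, the following is a minimal Python sketch of the first half of that question: ranking exons by SNP density. It is not Galaxy's implementation and does not use any Galaxy API. The function name `snp_density_per_exon` is hypothetical, and the sketch assumes exons are given as 0-based, half-open (chrom, start, end) intervals and SNPs as 0-based (chrom, pos) pairs already loaded into memory.

```python
from bisect import bisect_left
from collections import defaultdict


def snp_density_per_exon(exons, snps):
    """Return (chrom, start, end, snp_count, density) tuples, densest first.

    exons: iterable of (chrom, start, end), assumed 0-based and half-open.
    snps:  iterable of (chrom, pos), assumed 0-based positions.
    """
    # Group SNP positions by chromosome and sort them for binary search.
    positions = defaultdict(list)
    for chrom, pos in snps:
        positions[chrom].append(pos)
    for chrom in positions:
        positions[chrom].sort()

    results = []
    for chrom, start, end in exons:
        length = end - start
        if length <= 0:
            continue  # skip empty or malformed intervals
        sorted_pos = positions.get(chrom, [])
        # Count positions p with start <= p < end.
        count = bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)
        results.append((chrom, start, end, count, count / length))

    # Highest SNP density first.
    results.sort(key=lambda r: r[4], reverse=True)
    return results


# Toy data (made up for illustration only).
exons = [("chr1", 100, 200), ("chr1", 300, 350), ("chr2", 0, 10)]
snps = [("chr1", 100), ("chr1", 150), ("chr1", 200),
        ("chr1", 310), ("chr1", 320), ("chr1", 349), ("chr2", 10)]

for row in snp_density_per_exon(exons, snps):
    print(row)
```

Sorting the SNP positions once per chromosome keeps each exon lookup logarithmic, which matters when there are millions of SNPs. Extracting the orthologous sequences from multi-species alignments is a separate step that this sketch does not attempt.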

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Base Sequence
Chromosome Mapping / methods*
Computer Graphics
DNA / genetics*
DNA Mutational Analysis / methods*
Molecular Sequence Data
Sequence Alignment / methods*
Sequence Analysis, DNA / methods*
Software*
User-Computer Interface*

Substances

DNA
