Methods Mol Biol. 1418:93-110

The Gene Expression Omnibus Database

Emily Clough et al. Methods Mol Biol.

Abstract

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/.

Keywords: Data mining; Database; Functional genomics; Gene expression; High-throughput sequencing; Microarray.

Figures

Fig. 1
Workflow screenshots. After typing a search term into the GEO DataSets search box (1) and using the filter feature to restrict results to DataSet entries (2), the user retrieves 28 relevant records (3). The user selects the second DataSet, GDS4882, and uses the “Find genes” feature in the DataSet Analysis Tools to search for gene CREB5 in that DataSet (4). The workflow continues in Fig. 2.
Fig. 2
Workflow screenshots (continued). The “Find genes” feature in the DataSet Analysis Tools (Fig. 1) creates a search for gene CREB5 in DataSet GDS4882 (1). The user is presented with 3 results in GEO Profiles (2), meaning that the CREB5 gene is represented by 3 separate probesets on the Platform in GDS4882. The chart images show at a glance that all 3 CREB5 probesets exhibit a similar expression pattern. Clicking on the top chart reveals a detailed graphic (3), which shows that CREB5 is more highly expressed in the hepatocellular Samples than in the other Samples examined in that DataSet.
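The gene-in-DataSet search shown in Figs. 1 and 2 can also be approximated programmatically. The following is a minimal Python sketch, not the chapter's own method: it assumes the NCBI E-utilities `esearch` endpoint, the `geoprofiles` database name, and the `[Gene Symbol]` and `[ACCN]` field tags, none of which appear in the text above and all of which should be checked against current NCBI documentation. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the standard NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_profile_term(gene_symbol, dataset_accession):
    """Build a query term for one gene within one DataSet.

    Assumption: GEO Profiles accepts the [Gene Symbol] and [ACCN]
    field tags; verify them before relying on the results.
    """
    return f"{gene_symbol}[Gene Symbol] AND {dataset_accession}[ACCN]"


def build_esearch_url(term, db="geoprofiles", retmax=20):
    """Return the full ESearch URL without contacting the server."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def fetch_profile_ids(term):
    """Run the search and return the list of matching record IDs.

    Requires network access; it is not called in the example below.
    """
    with urlopen(build_esearch_url(term)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Mirrors the figure workflow: gene CREB5 in DataSet GDS4882.
    term = build_profile_term("CREB5", "GDS4882")
    print(build_esearch_url(term))
```

Building the URL separately from fetching it lets the query be inspected, or pasted into a browser, before any request is sent. If the assumed field tags are valid, `fetch_profile_ids(term)` would be expected to return one ID per matching GEO Profile.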
Fig. 3
Screenshot of NCBI Genome Data Viewer. The left side of the viewer has tools for locating specific regions of the genome (1). The tracks area depicts RefSeq gene, CpG island, and SNP tracks, which are set as defaults for context (2), and a track for GEO Sample GSM1586398, which is an H3K4me3 histone ChIP-seq experiment performed on liver tissue (3). This track shows a typical H3K4me3 double peak with depletion at the transcriptional start site of gene NBEAL1.
