caGEDA: a web application for the integrated analysis of global gene expression patterns in cancer

Appl Bioinformatics. 2004;3(1):49-62. doi: 10.2165/00822942-200403010-00007.


The explosion of microarray data from pilot studies, basic research and large-scale clinical trials requires integrative computational tools that can not only analyse gene expression patterns but also evaluate the methods of analysis adopted and support post-analysis translational interpretation of those patterns. We have developed a web application called caGEDA (cancer gene expression data analyzer) that can: (1) upload gene expression profiles from cDNA or oligonucleotide microarrays; (2) conduct a diverse range of serial linear normalisations; (3) identify differentially expressed genes using a variety of threshold or permutation tests; (4) produce tables of literature references to papers reporting that specific genes (identified by accession numbers) are up- or down-regulated in specific cancers; (5) estimate the error of sample class prediction using the significant gene set as features; (6) perform low-bias and accurate validated learning using three computational validation techniques (leave-one-out validation, k-fold validation and random re-sampling validation); and (7) validate a classifier with a randomly selected or user-defined validation set. Significant genes are reported in a table of links to entries in the following databases: LocusLink, Genome View, UCSC, Ensembl, UniGene, dbSNP, AmiGO and OMIM. caGEDA is integrated via embedded forms with the 2HAPI server at the University of California at San Diego (UCSD), for medical subject heading (MeSH) term exploration, and with EZ-Retrieve, to identify common transcription factors located upstream of sets of genes that exhibit similar modes of differential expression. caGEDA offers a variety of previously described and novel tests for differentially expressed genes, most notably the permutation percentile separability test, which is most appropriate for identifying genes that are significantly differentially expressed in a subset of patients. caGEDA, which is open source and free to academic users, will soon be greatly enhanced by operating with the components of the National Cancer Institute's new cancer bioinformatics grid (caBIG).
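
To make two of the ideas named in the abstract concrete, the following is a minimal Python sketch of a permutation test for one gene and a leave-one-out error estimate. It is not caGEDA's code: the function names `permutation_p_value` and `leave_one_out_error` are hypothetical, the difference-in-means statistic stands in for the tests the application offers (it is not the permutation percentile separability test), and the nearest-centroid classifier is an assumption chosen only for brevity.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in group means.

    Assumed statistic for illustration only; caGEDA's own tests may differ.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values simulates random class labels.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def leave_one_out_error(samples, labels):
    """Leave-one-out error rate of a nearest-centroid classifier.

    `samples` is a list of feature vectors (for example, expression values
    of the significant genes); `labels` holds one class label per sample.
    """
    errors = 0
    for i, (held_out, true_label) in enumerate(zip(samples, labels)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for label in set(labels):
            members = [
                s for j, (s, l) in enumerate(zip(samples, labels))
                if j != i and l == label
            ]
            if members:
                centroids[label] = [mean(column) for column in zip(*members)]
        # Assign the held-out sample to the closest centroid.
        predicted = min(
            centroids,
            key=lambda lab: sum(
                (x - c) ** 2 for x, c in zip(held_out, centroids[lab])
            ),
        )
        if predicted != true_label:
            errors += 1
    return errors / len(samples)
```

In this sketch the permutation test would be applied gene by gene, and the leave-one-out estimate would then be computed on the genes that pass a chosen significance cut-off. For an unbiased error estimate, gene selection would need to be repeated inside each leave-one-out fold, which the sketch omits.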

MeSH terms

  • Biomarkers, Tumor / chemistry
  • Biomarkers, Tumor / classification
  • Biomarkers, Tumor / metabolism*
  • Gene Expression Profiling / methods*
  • Humans
  • Internet
  • Neoplasm Proteins / chemistry
  • Neoplasm Proteins / classification
  • Neoplasm Proteins / metabolism*
  • Neoplasms / diagnosis*
  • Neoplasms / metabolism*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Software*
  • Systems Integration
  • User-Computer Interface*


Substances

  • Biomarkers, Tumor
  • Neoplasm Proteins