Identification of novel highly expressed genes in pancreatic ductal adenocarcinomas through a bioinformatics analysis of expressed sequence tags

Cancer Biol Ther. 2004 Nov;3(11):1081-9; discussion 1090-1. doi: 10.4161/cbt.3.11.1175. Epub 2004 Nov 12.


In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosa [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip(R) set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. Using the Fold Change Analysis Tool of the GeneExpress(R) software system, we identified 60 ESTs that were expressed at levels at least 3-fold greater in the pancreatic cancers than in normal tissues. Searches against the human genomic sequence and comparative genomic analysis of the human and mouse genomes were carried out using basic local alignment search tools (BLAST), BLASTN, and BLASTX to identify protein-coding genes corresponding to the ESTs. Subsequently, to select the most relevant candidate genes for more detailed analysis, we looked for domains and motifs in the open reading frames using the SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome, although we were unable to establish that these ESTs were indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5).
We conclude that bioinformatics tools can be used to characterize differentially overexpressed ESTs, and that some of these ESTs may represent diagnostically and therapeutically useful targets that might be missed using data solely from currently annotated databases.
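
The 3-fold selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the GeneExpress(R) Fold Change Analysis Tool computes its values, so this example assumes a plain ratio of mean tumor expression to mean normal expression. The function names, probe IDs, and intensity values are all hypothetical and are not taken from the study.

```python
from statistics import mean


def fold_change(tumor_values, normal_values):
    """Ratio of mean tumor expression to mean normal expression (assumed definition)."""
    normal_mean = mean(normal_values)
    if normal_mean <= 0:
        raise ValueError("mean normal expression must be positive")
    return mean(tumor_values) / normal_mean


def overexpressed_probes(expression, min_fold=3.0):
    """Return sorted probe IDs whose tumor/normal fold change is at least min_fold.

    `expression` maps a probe ID to a (tumor_values, normal_values) pair.
    """
    return sorted(
        probe
        for probe, (tumor, normal) in expression.items()
        if fold_change(tumor, normal) >= min_fold
    )


# Made-up intensities for three hypothetical probes (not data from the paper).
toy = {
    "probe_a": ([900.0, 1100.0], [200.0, 300.0]),  # means 1000 vs 250
    "probe_b": ([310.0, 290.0], [150.0, 250.0]),   # means 300 vs 200
    "probe_c": ([600.0, 600.0], [200.0, 200.0]),   # means 600 vs 200
}
selected = overexpressed_probes(toy)
print(selected)
```

A ratio of means is only one possible definition of fold change. Ratios of medians or differences of log-transformed intensities are common alternatives and can select a different set of probes.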

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adenocarcinoma / genetics
  • Adenocarcinoma / metabolism
  • Biomarkers, Tumor / genetics*
  • Carcinoma, Pancreatic Ductal / genetics*
  • Carcinoma, Pancreatic Ductal / metabolism
  • Computational Biology*
  • DNA, Complementary
  • Expressed Sequence Tags*
  • Gene Expression Profiling*
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Pancreatic Neoplasms / genetics*
  • Pancreatic Neoplasms / metabolism
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • Tumor Cells, Cultured


Substances

  • Biomarkers, Tumor
  • DNA, Complementary
  • RNA, Messenger