Similarity of markers identified from cancer gene expression studies: observations from GEO

Brief Bioinform. 2014 Sep;15(5):671-84. doi: 10.1093/bib/bbt044. Epub 2013 Jun 19.

Abstract

Gene expression profiling has been used extensively in cancer research. The analysis of multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct a multi-dataset analysis to evaluate the similarity of cancer-associated genes identified from different datasets. The first objective is to briefly review statistical methods that can be used for such an evaluation, covering both marginal analysis and joint analysis. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancer. Our analysis suggests that, for the same cancer, the marker identification results may vary significantly across datasets, and different datasets share few common genes. In addition, datasets on different cancers share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data.
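
To make the idea of comparing markers across datasets concrete, here is a minimal sketch of one possible marginal-analysis comparison. It is not the procedure used in the article, which the abstract does not spell out. It assumes that each gene is ranked on its own by a Welch t-statistic between case and control samples, that the top k genes are taken as a dataset's markers, and that similarity is measured as the Jaccard overlap of those marker sets. The function names, the data layout, and the toy values are all hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(case, control):
    """Welch t-statistic for a single gene (marginal, one gene at a time)."""
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se if se > 0 else 0.0


def top_markers(dataset, k):
    """Return the k genes with the largest |t|.

    dataset maps gene name -> (case_values, control_values); this layout
    is an assumption made for the sketch.
    """
    ranked = sorted(dataset, key=lambda g: abs(welch_t(*dataset[g])), reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index of two marker sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(datasets, k):
    """Jaccard overlap of top-k marker sets for every pair of datasets."""
    markers = {name: top_markers(d, k) for name, d in datasets.items()}
    return {
        (m, n): jaccard(markers[m], markers[n])
        for m, n in combinations(sorted(markers), 2)
    }


if __name__ == "__main__":
    # Toy values, invented purely for illustration (not GEO data).
    toy = {
        "study_a": {"G1": ([5, 6, 7], [1, 2, 3]), "G2": ([1, 2, 3], [1, 2, 3.5])},
        "study_b": {"G1": ([1, 2, 3], [1, 2, 3.1]), "G2": ([8, 9, 10], [1, 2, 3])},
    }
    print(pairwise_similarity(toy, k=1))
```

A low Jaccard value for two datasets on the same cancer would correspond to the kind of limited overlap the abstract reports. A joint analysis, which models genes together rather than one at a time, would require a different marker-selection step than the per-gene ranking shown here.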

Keywords: GEO; cancer gene expression study; marker identification; similarity.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biomarkers, Tumor / metabolism*
  • Gene Expression Profiling*
  • Humans
  • Models, Theoretical
  • Neoplasms / genetics*

Substances

  • Biomarkers, Tumor