A computational procedure to identify significant overlap of differentially expressed and genomic imbalanced regions in cancer datasets

Nucleic Acids Res. 2009 Aug;37(15):5057-70. doi: 10.1093/nar/gkp520. Epub 2009 Jun 19.

Abstract

The integration of high-throughput genomic data represents an opportunity for deciphering the interplay between structural and functional organization of genomes and for discovering novel biomarkers. However, the development of integrative approaches to complement gene expression (GE) data with other types of gene information, such as copy number (CN) and chromosomal localization, still represents a computational challenge in the genomic arena. This work presents a computational procedure that directly integrates CN and GE profiles at the genome-wide level. When applied to DNA/RNA paired data, this approach leads to the identification of Significant Overlaps of Differentially Expressed and Genomic Imbalanced Regions (SODEGIR). This goal is accomplished in three steps. The first step extends to CN a method for detecting regional imbalances in GE. The second step integrates CN and GE data and identifies chromosomal regions with concordantly altered genomic and transcriptional status in a tumor sample. The last step elevates the single-sample analysis to an entire dataset of tumor specimens. When applied to the study of chromosomal aberrations in a collection of astrocytoma and renal carcinoma samples, the procedure proved to be effective in identifying discrete chromosomal regions of coordinated CN alterations and changes in transcriptional levels.
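
The three steps described above can be illustrated with a minimal Python sketch. It is not the published SODEGIR implementation: the abstract does not specify the underlying statistics, so the moving-average smoothing, the fixed threshold, and the recurrence fraction used here are simplifying assumptions, and all function names (`smooth`, `call_imbalance`, `concordant_overlap`, `recurrence`) are hypothetical. The inputs are assumed to be per-gene scores for one chromosome, ordered by genomic position and matched between the GE and CN profiles.

```python
def smooth(scores, window=3):
    """Centered moving average along the chromosome.

    Assumption: a simple stand-in for whatever regional statistic
    the actual method uses; the window shrinks at the edges.
    """
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out


def call_imbalance(scores, threshold=1.0, window=3):
    """Step 1 (sketch): label each position +1 (gain / up-regulated),
    -1 (loss / down-regulated), or 0 (unchanged).

    The same function is applied to GE and to CN scores, mirroring the
    idea of extending a GE regional method to CN data.
    """
    calls = []
    for s in smooth(scores, window):
        if s >= threshold:
            calls.append(1)
        elif s <= -threshold:
            calls.append(-1)
        else:
            calls.append(0)
    return calls


def concordant_overlap(ge_calls, cn_calls):
    """Step 2 (sketch): keep only positions where GE and CN are altered
    in the same direction within a single tumor sample."""
    return [g if g == c else 0 for g, c in zip(ge_calls, cn_calls)]


def recurrence(per_sample_calls, min_fraction=0.5):
    """Step 3 (sketch): flag positions that are concordantly altered in
    at least `min_fraction` of the samples (hypothetical criterion)."""
    n = len(per_sample_calls)
    flagged = []
    for position in zip(*per_sample_calls):
        gains = sum(1 for v in position if v == 1)
        losses = sum(1 for v in position if v == -1)
        if gains / n >= min_fraction:
            flagged.append(1)
        elif losses / n >= min_fraction:
            flagged.append(-1)
        else:
            flagged.append(0)
    return flagged
```

In this sketch, each sample contributes one vector from `concordant_overlap`, and `recurrence` summarizes those vectors across the dataset. A real analysis would replace the fixed threshold with a significance assessment and would merge adjacent flagged positions into regions.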

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Astrocytoma / genetics
  • Carcinoma / genetics
  • Chromosome Aberrations
  • Data Interpretation, Statistical
  • Gene Dosage
  • Gene Expression Profiling*
  • Genomics / methods*
  • Humans
  • Kidney Neoplasms / genetics
  • Neoplasms / genetics*
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide