Finding recurrent copy number alterations preserving within-sample homogeneity

Bioinformatics. 2011 Nov 1;27(21):2949-56. doi: 10.1093/bioinformatics/btr488. Epub 2011 Aug 25.


Motivation: Copy number alterations (CNAs) are an important component of genetic variation and play a significant role in many human diseases. Array comparative genomic hybridization (aCGH) technology has made it possible to identify CNAs. Identifying recurrent CNAs is the first fundamental step toward a list of genomic regions that form the basis for further biological investigation. The main problem in recurrent CNA discovery is distinguishing functional changes from random events without pathological relevance. Within-sample homogeneity is a common feature of copy number profiles in cancer, so it can be used as an additional source of information to increase the accuracy of the results. Although several algorithms for identifying recurrent CNAs have been proposed, no comprehensive comparison of the different approaches has yet been published.

Results: We propose a new approach, called Genomic Analysis of Important Alterations (GAIA), to find recurrent CNAs, in which a statistical hypothesis framework is extended to take within-sample homogeneity into account. Statistical significance and within-sample homogeneity are combined in an iterative procedure to extract the regions that are likely to be involved in functional changes. Results show that GAIA is a valid alternative to other proposed approaches. In addition, we perform a thorough comparison using two real aCGH datasets and a carefully planned simulation study.
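
As a rough illustration of the kind of statistical hypothesis framework mentioned above, the following sketch runs a generic permutation test for recurrence on binary alteration calls. It is not GAIA's algorithm: the function name `recurrence_pvalues`, the max-count statistic, and the toy data are assumptions made for this example, and the within-sample homogeneity step and the iterative region extraction are not modeled.

```python
import random


def recurrence_pvalues(calls, n_perm=200, seed=0):
    """Permutation p-value per probe for recurrent alterations.

    calls: list of samples, each a list of 0/1 flags per probe (1 = altered).
    Hypothetical helper for illustration only; not part of the GAIA package.
    """
    rng = random.Random(seed)
    n_probes = len(calls[0])
    # Observed statistic: number of samples altered at each probe.
    observed = [sum(sample[j] for sample in calls) for j in range(n_probes)]
    # Null distribution: shuffle each sample's calls across probes
    # independently and record the largest per-probe count.
    null_max = []
    for _ in range(n_perm):
        shuffled = [rng.sample(sample, n_probes) for sample in calls]
        counts = [sum(s[j] for s in shuffled) for j in range(n_probes)]
        null_max.append(max(counts))
    # p-value: share of permutations whose maximum reaches the observed
    # count, with a +1 correction so that no p-value is exactly zero.
    return [
        (1 + sum(m >= obs for m in null_max)) / (n_perm + 1)
        for obs in observed
    ]


# Toy data (assumed): 10 samples, 50 probes. Probe 5 is altered in every
# sample; each sample also carries one private alteration elsewhere.
calls = []
for i in range(10):
    sample = [0] * 50
    sample[5] = 1
    sample[10 + 3 * i] = 1
    calls.append(sample)

pvals = recurrence_pvalues(calls)
significant = [j for j, p in enumerate(pvals) if p < 0.05]
print(significant)
```

Shuffling probes independently within each sample ignores the contiguity of real copy number segments, which is one reason a method would bring in additional information such as within-sample homogeneity.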

Availability: GAIA has been implemented as an R/Bioconductor package and is available for download.


Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Comparative Genomic Hybridization
  • Genomic Structural Variation*
  • Genomics
  • Humans
  • Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis / methods