Motivation: Tumors exhibit numerous genomic lesions such as copy number variations, structural variations and sequence variations. It is difficult to determine whether a specific constellation of lesions observed across a cohort of tumors provides statistically significant evidence that the lesions target a set of genes that may be located on different chromosomes yet are all involved in a single biological process or function.
Results: We introduce the genomic random interval (GRIN) statistical model and analysis method that evaluates the statistical significance of the abundance of genomic lesions that overlap a specific locus or a pre-defined set of biologically related loci. The GRIN model retains certain biologically important properties of genomic lesions that are ignored by other methods. In a simulation study and two example analyses of leukemia genomic lesion data, GRIN identified important loci as significant more effectively than three methods based on a permutation-of-markers model. In both examples, GRIN also identified biologically relevant pathways with a significant abundance of lesions.
Availability: An R package will be freely available at CRAN and www.stjuderesearch.org/site/depts/biostats/software.