GINOM: A Statistical Framework for Assessing Interval Overlap of Multiple Genomic Features

PLoS Comput Biol. 2017 Jun 15;13(6):e1005586. doi: 10.1371/journal.pcbi.1005586. eCollection 2017 Jun.

Abstract

A common problem in genomics is to test for associations between two or more genomic features, typically represented as intervals interspersed across the genome. Existing methodologies can test for significant pairwise associations between two sets of genomic intervals; however, they cannot test for associations involving more than two sets of intervals. This limits our ability to uncover more complex, yet biologically important, associations between multiple sets of genomic features. We introduce GINOM (Genomic INterval Overlap Model), a new method that enables testing for significant associations between multiple genomic features. We demonstrate GINOM's ability to identify higher-order associations with both simulated and real data. In particular, we used GINOM to explore L1 retrotransposable element insertion bias in lung cancer and found a significant pairwise association between L1 insertions and heterochromatic marks. Unlike other methods, GINOM also detected an association between L1 insertions and gene bodies marked by a facultative heterochromatic mark, which could explain the observed bias for L1 insertions towards cancer-associated genes.
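
To make the pairwise setting concrete, the following is a minimal sketch of a generic overlap test between two interval sets, of the kind the abstract describes existing methods as providing. It is not GINOM's statistical model, which the abstract does not specify. The function names, the half-open `(start, end)` interval convention, the single linear chromosome of length `genome_length`, and the circular-shift null model are all assumptions made for illustration.

```python
import random

def merge(intervals):
    """Merge overlapping or touching half-open intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def overlap_bp(a, b):
    """Count positions covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total

def circular_shift(intervals, offset, genome_length):
    """Shift every interval by the same offset, wrapping around the end of the genome."""
    shifted = []
    for start, end in intervals:
        s = (start + offset) % genome_length
        e = s + (end - start)
        if e <= genome_length:
            shifted.append((s, e))
        else:
            # Interval crosses the end: split it into two pieces.
            shifted.append((s, genome_length))
            shifted.append((0, e - genome_length))
    return shifted

def permutation_p_value(query, reference, genome_length, n_perm=1000, seed=0):
    """Empirical one-sided p-value for enrichment of query/reference overlap.

    Null model (an assumption of this sketch): the query set is circularly
    shifted by a uniformly random offset, which preserves interval lengths
    and spacing but breaks any positional link to the reference set.
    """
    rng = random.Random(seed)
    observed = overlap_bp(query, reference)
    hits = 0
    for _ in range(n_perm):
        offset = rng.randrange(genome_length)
        shifted = circular_shift(query, offset, genome_length)
        if overlap_bp(shifted, reference) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A call such as `permutation_p_value(l1_insertions, heterochromatin, genome_length)` (both argument names hypothetical) would return an empirical p-value for one pair of features. A test of this pairwise form cannot by itself separate a joint association among three or more features from its pairwise components, which is the gap the abstract says GINOM addresses.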

MeSH terms

  • Algorithms
  • Chromosome Mapping / methods*
  • Computer Simulation
  • Genome / genetics*
  • High-Throughput Nucleotide Sequencing
  • Models, Genetic
  • Models, Statistical*
  • Sequence Alignment
  • Sequence Analysis, DNA
  • Sequence Homology, Nucleic Acid*
  • Software