Multiscale representation of genomic signals

Nat Methods. 2014 Jun;11(6):689-94. doi: 10.1038/nmeth.2924. Epub 2014 Apr 13.

Abstract

Genomic information is encoded on a wide range of distance scales, from tens of bases to megabases. We developed a multiscale framework to analyze and visualize the information content of genomic signals. Different types of signals, such as G+C content or DNA methylation, are characterized by distinct patterns of signal enrichment or depletion across scales spanning several orders of magnitude. These patterns are associated with a variety of genomic annotations. By integrating the information across all scales, we demonstrated improved prediction of gene expression from polymerase II chromatin immunoprecipitation sequencing (ChIP-seq) measurements, and we observed that gene expression differences in colorectal cancer are related to methylation patterns that extend beyond the single-gene scale. Our software is available at https://github.com/tknijnen/msr/.
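
To make the idea of viewing one signal at several scales concrete, the following is a minimal illustrative sketch, not the published method and not the API of the software linked above. It assumes a toy DNA string and computes the G+C fraction in non-overlapping windows of several sizes, then reports each window's deviation from a reference mean as a crude stand-in for "enrichment or depletion". The function names `gc_fraction_by_scale` and `deviation_from_mean`, the window sizes, and the toy sequence are all hypothetical choices made for illustration.

```python
def gc_fraction_by_scale(sequence, window_sizes):
    """Return {window_size: [G+C fraction per non-overlapping window]}.

    Hypothetical helper for illustration only; trailing bases that do not
    fill a complete window are ignored.
    """
    seq = sequence.upper()
    is_gc = [1 if base in ("G", "C") else 0 for base in seq]
    profiles = {}
    for w in window_sizes:
        fractions = []
        for start in range(0, len(is_gc) - w + 1, w):
            fractions.append(sum(is_gc[start:start + w]) / w)
        profiles[w] = fractions
    return profiles


def deviation_from_mean(profiles, reference_mean):
    """Subtract a reference mean from every window at every scale.

    Positive values suggest local enrichment, negative values depletion.
    This is a simple difference, not a statistical test.
    """
    return {
        w: [fraction - reference_mean for fraction in fractions]
        for w, fractions in profiles.items()
    }


# Toy 24-base sequence (assumed data, not from the study).
toy_sequence = "GCGCGCGCATATATATGCGCATAT"
profiles = gc_fraction_by_scale(toy_sequence, [4, 8, 24])
deviations = deviation_from_mean(profiles, reference_mean=profiles[24][0])

for window_size in sorted(profiles):
    print(window_size, profiles[window_size], deviations[window_size])
```

In this toy case the 8-base windows show one fully G+C window and one fully A+T window, while the single 24-base window averages them out, which is the kind of scale dependence the abstract describes. A real analysis would need proper segmentation and significance testing, which this sketch does not attempt.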

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • DNA / chemistry
  • DNA Methylation
  • Genomics / methods*
  • Humans
  • Sequence Analysis, DNA
  • Software*
  • Transcriptome*

Substances

  • DNA

Associated data

  • GEO/GSE54414