CLIMB: High-dimensional association detection in large scale genomic data

Nat Commun. 2022 Nov 12;13(1):6874. doi: 10.1038/s41467-022-34360-z.

Abstract

Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework facilitating a host of analyses, such as clustering genomic features sharing similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. We apply CLIMB to three sets of hematopoietic data, which examine CTCF ChIP-seq measured in 17 different cell populations, RNA-seq measured across constituent cell populations in three committed lineages, and DNase-seq in 38 cell populations. Our results show that CLIMB improves upon existing alternatives in statistical precision, while capturing interpretable and biologically relevant clusters in the data.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Cluster Analysis
  • Genome*
  • Genomics*
  • Sequence Analysis, DNA / methods