The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models

Cell. 2023 Mar 30;186(7):1493-1511.e40. doi: 10.1016/j.cell.2023.02.018.

Abstract

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
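As a rough illustration of what "allele-specific" means at a single heterozygous site, the sketch below tests whether reads assigned to the two haplotypes deviate from the 50:50 split expected under balanced activity. The abstract does not describe the resource's statistical procedure, so this is not the authors' method: the exact binomial test, the function names (`binomial_two_sided_p`, `is_allele_specific`), the 0.05 threshold, and the read counts are all assumptions chosen for illustration.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count k. Intended for modest read depths (hypothetical
    helper, not taken from the paper).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties in probability are counted.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= threshold))


def is_allele_specific(hap1_reads, hap2_reads, alpha=0.05):
    """Flag a heterozygous site as allele-specific if the read split
    between haplotypes departs from 50:50 (alpha is an assumed cutoff)."""
    n = hap1_reads + hap2_reads
    if n == 0:
        # No coverage: nothing can be concluded about imbalance.
        return False, 1.0
    p_value = binomial_two_sided_p(hap1_reads, n)
    return p_value < alpha, p_value


if __name__ == "__main__":
    # Made-up read counts for two sites, purely for demonstration.
    for hap1, hap2 in [(10, 10), (30, 5)]:
        flagged, p_value = is_allele_specific(hap1, hap2)
        print(f"hap1={hap1} hap2={hap2} allele_specific={flagged} p={p_value:.3g}")
```

A real analysis would also need to handle mapping bias, overdispersion, and multiple-testing correction across sites, none of which this sketch attempts.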

Keywords: ENCODE; GTEx; allele-specific activity; eQTLs; functional epigenomes; functional genomics; genome annotations; personal genome; predictive models; structural variants; tissue specificity; transformer model.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epigenome*
  • Genome-Wide Association Study
  • Genomics
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Quantitative Trait Loci*