A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals

Nat Commun. 2016 Apr 18;7:11101. doi: 10.1038/ncomms11101.

Abstract

Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single-nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. Such behaviour can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also find SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
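
As an illustration of what measuring an 'allelic imbalance' at a single heterozygous SNV can look like, the sketch below compares the numbers of reads mapped to the paternal and maternal alleles against a null of equal representation. It is a minimal sketch and not the pipeline described in the abstract: the function name `imbalance_p_value` and the read counts are hypothetical, and the plain binomial null used here is an assumption that does not model the overdispersion or mapping bias the abstract mentions.

```python
from math import comb


def imbalance_p_value(paternal: int, maternal: int) -> float:
    """Exact two-sided binomial p-value under a 50:50 null (illustrative only).

    Sums the probabilities of all outcomes that are no more likely than the
    observed paternal read count. Integer arithmetic keeps the tail sum exact.
    """
    if paternal < 0 or maternal < 0:
        raise ValueError("read counts must be non-negative")
    n = paternal + maternal
    if n == 0:
        raise ValueError("no reads cover this site")
    # Under p = 0.5, P(X = i) = C(n, i) / 2**n, so likelihoods can be
    # compared through the binomial coefficients alone.
    observed = comb(n, paternal)
    tail = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return tail / 2**n


# Hypothetical read counts at one site: 8 paternal reads, 2 maternal reads.
p_value = imbalance_p_value(8, 2)
print(f"allelic ratio = {8 / (8 + 2):.2f}, p = {p_value:.6f}")
```

With only ten reads, an 8:2 split is not strong evidence of imbalance under this simplified null, which illustrates why read depth matters and why a null that ignores overdispersion should be treated with caution.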

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites / genetics
  • Chromosome Mapping / methods*
  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression
  • Gene Frequency
  • Genome, Human / genetics*
  • Genomics / methods*
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Human Genome Project
  • Humans
  • Internet
  • Molecular Sequence Annotation / methods
  • Polymorphism, Single Nucleotide*
  • Precision Medicine / methods