Background: Genetic studies have provided ample evidence that non-coding DNA polymorphisms influence trait variance, particularly those occurring within transcription factor binding sites. Protein binding microarrays and other platforms that map these sites with great precision have improved our understanding of how a single nucleotide polymorphism can alter binding potential in vitro, allowing better prediction of its effect on a transcription factor binding site.
Results: We used protein binding microarrays and electrophoretic mobility shift assay-sequencing (EMSA-Seq), a deep sequencing-based method we developed, to analyze nine distinct human NF-κB dimers. This family of transcription factors is one of the most extensively studied, but our understanding of its DNA binding preferences has been limited to the originally described consensus motif, GGRRNNYYCC. We highlight differences between NF-κB family members and emphasize non-canonical motifs that have so far received little attention. We use our data to interpret transcription factor binding differences between individuals across 1,405 genomic regions containing single nucleotide polymorphisms. We also associate binding correlations derived from our data with disease risk alleles and demonstrate the utility of these data as a tool for functional studies of single nucleotide polymorphisms in regulatory regions.
Conclusions: NF-κB dimers bind specifically to non-canonical motifs, which can be found within genomic regions where no canonical motif is evident. Binding affinity data generated with these different motifs can be used in conjunction with data from chromatin immunoprecipitation-sequencing (ChIP-Seq) to enable allele-specific analyses of expression and transcription factor-DNA interactions on a genome-wide scale.
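
As an illustration of the canonical consensus mentioned above, the following minimal Python sketch expands GGRRNNYYCC using the standard IUPAC codes (R = A or G, Y = C or T, N = any base) and checks whether two alleles of a region still match it. The function names and the two example sequences are hypothetical and are not taken from the study. A consensus match is only a yes/no test; it does not model the quantitative binding affinities measured by protein binding microarrays or EMSA-Seq, and by construction it cannot detect the non-canonical motifs described here.

```python
import re

# Standard IUPAC nucleotide codes needed for the consensus GGRRNNYYCC.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression string."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(sequence, consensus="GGRRNNYYCC"):
    """Return (start, matched_site) for every match, including overlapping ones.

    Only the given strand is scanned. As a pattern, GGRRNNYYCC equals its own
    reverse complement, so one strand is sufficient for this consensus.
    """
    lookahead = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]

# Hypothetical reference and alternate alleles differing at a single position.
ref_allele = "TTGGGACTTTCCAA"
alt_allele = "TTGGGACTTTACAA"

print(find_sites(ref_allele))  # [(2, 'GGGACTTTCC')]
print(find_sites(alt_allele))  # []
```

In this toy case the single-base change removes the consensus match. Whether binding is actually lost would have to be judged from measured affinity data rather than from the consensus alone.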