atSNP: transcription factor binding affinity testing for regulatory SNP detection

Bioinformatics. 2015 Oct 15;31(20):3353-5. doi: 10.1093/bioinformatics/btv328. Epub 2015 Jun 18.

Abstract

Motivation: Genome-wide association studies have revealed that most disease-associated single nucleotide polymorphisms (SNPs) are located in regulatory regions within introns or in regions between genes. Regulatory SNPs (rSNPs) are SNPs that affect gene regulation by changing the binding affinities of transcription factors (TFs) to genomic sequences. Identifying potential rSNPs is crucial for understanding disease mechanisms. Existing in silico methods that evaluate the impact of SNPs on TF binding affinities do not scale to large-scale analysis.

Results: We describe Affinity Testing for regulatory SNPs (atSNP), a computationally efficient R package for identifying rSNPs in silico. atSNP implements an importance sampling algorithm coupled with a first-order Markov model for the background nucleotide sequences to test the significance of affinity scores and of SNP-driven changes in these scores. Application of atSNP to >20 K SNPs indicates that atSNP is the only available tool for such a large-scale task. atSNP provides user-friendly output in the form of both tables and composite logo plots for visualizing SNP-motif interactions. Evaluations of atSNP with known rSNP-TF interactions indicate that atSNP is able to prioritize motifs for a given set of SNPs with high accuracy.
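
To make the testing idea concrete, the following is a minimal Python sketch, not the atSNP implementation and not its R API. It scores a reference and an alternate allele sequence against a position weight matrix and estimates p-values under a first-order Markov background. For simplicity it uses plain Monte Carlo sampling in place of the importance sampling algorithm named above, so it would be far too slow for small p-values at scale. All names (PWM, INIT, TRANS, best_window_score, monte_carlo_pvalue), the toy motif, and the background parameters are hypothetical and chosen only for illustration.

```python
import math
import random

BASES = "ACGT"

# Hypothetical 4-bp motif with consensus TGCA: 0.85 for the consensus base,
# 0.05 for each other base at every position.
PWM = [{b: (0.85 if b == c else 0.05) for b in BASES} for c in "TGCA"]

# Assumed first-order Markov background: uniform start, uniform transitions
# except that G is less likely after C (a toy CpG-depletion effect).
INIT = {b: 0.25 for b in BASES}
TRANS = {p: {b: 0.25 for b in BASES} for p in BASES}
TRANS["C"] = {"A": 0.3, "C": 0.3, "G": 0.1, "T": 0.3}


def pwm_log_score(window, pwm):
    """Log-probability of a motif-length window under the PWM."""
    return sum(math.log(pwm[i][base]) for i, base in enumerate(window))


def best_window_score(seq, pwm):
    """Maximum PWM log-score over all motif-length windows of seq."""
    width = len(pwm)
    return max(pwm_log_score(seq[i:i + width], pwm)
               for i in range(len(seq) - width + 1))


def sample_markov(length, init, trans, rng):
    """Draw one sequence from the first-order Markov background."""
    seq = rng.choices(BASES, weights=[init[b] for b in BASES])[0]
    while len(seq) < length:
        prev = seq[-1]
        seq += rng.choices(BASES, weights=[trans[prev][b] for b in BASES])[0]
    return seq


def monte_carlo_pvalue(observed, pwm, length, init, trans, n=2000, seed=0):
    """Estimate P(best score >= observed) under the background model.

    Plain Monte Carlo with an add-one correction; this stands in for
    importance sampling only to keep the sketch short.
    """
    rng = random.Random(seed)
    hits = sum(
        best_window_score(sample_markov(length, init, trans, rng), pwm) >= observed
        for _ in range(n)
    )
    return (hits + 1) / (n + 1)


# A SNP at the center of a window of length 2 * 4 - 1 = 7, so that every
# motif-length window overlaps the SNP position (index 3).
ref_seq = "CATGCAT"  # reference allele G completes the TGCA match
alt_seq = "CATACAT"  # alternate allele A disrupts it

ref_score = best_window_score(ref_seq, PWM)
alt_score = best_window_score(alt_seq, PWM)
ref_p = monte_carlo_pvalue(ref_score, PWM, len(ref_seq), INIT, TRANS)
alt_p = monte_carlo_pvalue(alt_score, PWM, len(alt_seq), INIT, TRANS)

print(f"ref score {ref_score:.3f}, estimated p-value {ref_p:.4f}")
print(f"alt score {alt_score:.3f}, estimated p-value {alt_p:.4f}")
print(f"score change (ref - alt) {ref_score - alt_score:.3f}")
```

In this toy case the reference allele yields a perfect motif match and the alternate allele does not, so the score drops and the estimated p-value rises. The significance of the score change itself could be assessed in the same way by sampling background sequence pairs that differ at the central position; that step is omitted here.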

Availability and implementation: https://github.com/keleslab/atSNP.

Contact: keles@stat.wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Databases, Genetic
  • Gene Expression Regulation*
  • Genome, Human
  • Genome-Wide Association Study
  • Genomics / methods*
  • Humans
  • Polymorphism, Single Nucleotide / genetics*
  • Protein Binding
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors