Res2s2aM: Deep residual network-based model for identifying functional noncoding SNPs in trait-associated regions

Pac Symp Biocomput. 2019;24:76-87.

Abstract

Noncoding single nucleotide polymorphisms (SNPs) and their target genes are important components of the heritability of diseases and other polygenic traits. Identifying these SNPs and target genes could potentially reveal new molecular mechanisms and advance precision medicine. For polygenic traits, genome-wide association studies (GWAS) are preferred tools for identifying trait-associated regions. However, identifying causal noncoding SNPs within such regions is a difficult problem in computational biology. The DNA sequence context of a noncoding SNP is well established as an important source of information for discriminating functional from nonfunctional noncoding SNPs. We describe the use of a deep residual network (ResNet)-based model, entitled Res2s2aM, that fuses flanking DNA sequence information with additional SNP annotation information to discriminate functional from nonfunctional noncoding SNPs. On a ground-truth set of disease-associated SNPs compiled from the Genome-wide Repository of Associations between SNPs and Phenotypes (GRASP) database, Res2s2aM improves the prediction accuracy of functional SNPs significantly in comparison to models based only on sequence information as well as a leading tool for post-GWAS noncoding SNP prioritization (RegulomeDB).
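To illustrate what fusing flanking sequence with annotation information can look like at the input level, the following is a minimal sketch and not the authors' implementation. The function names, the window size, and the two annotation values are hypothetical, and the abstract does not state at which layer Res2s2aM performs the fusion; here the one-hot encoded flanking window is simply concatenated with an annotation vector.

```python
# Illustrative sketch only: all names, the flank size, and the annotation
# values below are assumptions, not details taken from the paper.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def flanking_window(chrom_seq, snp_pos, flank):
    """Return the sequence centred on snp_pos (0-based) with `flank` bases on each side."""
    start, end = snp_pos - flank, snp_pos + flank + 1
    if start < 0 or end > len(chrom_seq):
        raise ValueError("window extends past the sequence")
    return chrom_seq[start:end]

def fuse(seq_features, annotation_features):
    """Flatten the sequence features and append the annotation features."""
    flat = [v for row in seq_features for v in row]
    return flat + list(annotation_features)

# Toy example: a SNP at position 5 with 2 flanking bases on each side,
# plus two made-up annotation scores.
window = flanking_window("TTACGGATCCA", snp_pos=5, flank=2)
encoded = one_hot(window)
fused = fuse(encoded, [0.3, 1.0])
```

In a learned model, the sequence branch would typically pass through convolutional residual blocks before being combined with the annotation features; the plain concatenation above only shows how the two information sources can share one input representation.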

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Computational Biology
  • Databases, Nucleic Acid / statistics & numerical data
  • Deep Learning*
  • Genome-Wide Association Study / statistics & numerical data
  • Humans
  • Models, Genetic
  • Molecular Sequence Annotation
  • Neural Networks, Computer*
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, DNA