Accurate prediction of deleterious protein kinase polymorphisms

Bioinformatics. 2007 Nov 1;23(21):2918-25. doi: 10.1093/bioinformatics/btm437. Epub 2007 Sep 12.

Abstract

Motivation: Contemporary, high-throughput sequencing efforts have identified a rich source of naturally occurring single nucleotide polymorphisms (SNPs), a subset of which occur in the coding region of genes and result in a change in the encoded amino acid sequence (non-synonymous coding SNPs or 'nsSNPs'). It is hypothesized that a subset of these nsSNPs may underlie common human disease. Testing all of these polymorphisms for disease association would be time-consuming and expensive. Thus, computational methods have been developed both to prioritize candidate nsSNPs and to make sense of their likely molecular physiologic impact.

Results: We have developed a method to prioritize nsSNPs and have applied it to the human protein kinase gene family. The results of our analyses provide high-quality predictions and outperform available whole-genome prediction methods (83% versus 74% prediction accuracy). Our analyses and methods consider both DNA sequence conservation, on which most traditional methods are based, as well as unique structural and functional features of kinases. We provide a ranked list of common kinase nsSNPs that, based on our analyses, have a higher probability of impacting human disease.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis / methods*
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • Molecular Sequence Data
  • Polymorphism, Single Nucleotide / genetics*
  • Protein Kinases / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*

Substances

  • Protein Kinases