Nucleic Acids Res. 2003 Jul 1;31(13):3812-4.
doi: 10.1093/nar/gkg509.

SIFT: Predicting Amino Acid Changes That Affect Protein Function

Free PMC article

Pauline C Ng et al. Nucleic Acids Res.

Abstract

Single nucleotide polymorphism (SNP) studies and random mutagenesis projects identify amino acid substitutions in protein-coding regions. Each substitution has the potential to affect protein function. SIFT (Sorting Intolerant From Tolerant) is a program that predicts whether an amino acid substitution affects protein function so that users can prioritize substitutions for further study. We have shown that SIFT can distinguish between functionally neutral and deleterious amino acid changes in mutagenesis studies and on human polymorphisms. SIFT is available at http://blocks.fhcrc.org/sift/SIFT.html.

Figures

Figure 1
An example of SIFT predictions for amino acid changes in a protein. Substitutions with a score less than 0.05 are predicted to affect protein function. In the last prediction, the median conservation of the sequences does not meet the threshold, so a warning is issued.
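
The decision rule in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration, not SIFT's own code: the function name `predict`, the `Prediction` record, and the example substitution labels are hypothetical. The 0.05 score cutoff comes from the caption; the median conservation cutoff of 3.25 and the direction of the comparison (warn when the value is above the cutoff) are assumptions, since the caption does not state them.

```python
from dataclasses import dataclass
from typing import Optional

SCORE_CUTOFF = 0.05  # from the Figure 1 caption


@dataclass
class Prediction:
    substitution: str
    score: float
    affects_function: bool
    warning: Optional[str]


def predict(substitution: str, score: float, median_conservation: float,
            conservation_cutoff: float = 3.25) -> Prediction:
    """Apply the caption's rule: a score below 0.05 is predicted to affect function.

    conservation_cutoff is an assumed value, and so is the rule that a
    median conservation above it triggers the low-diversity warning.
    """
    affects = score < SCORE_CUTOFF
    warning = None
    if median_conservation > conservation_cutoff:
        warning = "sequences in the alignment may not be diverse enough"
    return Prediction(substitution, score, affects, warning)


# Hypothetical substitutions and scores, for illustration only.
for sub, score, cons in [("A10V", 0.01, 3.0), ("K5R", 0.40, 3.0), ("G2D", 0.00, 3.9)]:
    print(predict(sub, score, cons))
```

Keeping the warning separate from the prediction mirrors the caption: a low score is still reported, but the reader is told that the alignment behind it may be too closely related to trust fully.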
Figure 2
Prediction depends on the diversity of the sequences used in the alignment. The percentage of substitutions correctly predicted is based on over 4000 substitutions that were assayed throughout the LacI protein of Escherichia coli (2,12). When the sequences in the alignment used for prediction are closely related (high median conservation), many positions appear conserved and important for function. In this situation, prediction accuracy on deleterious substitutions is high, but many functionally neutral substitutions are erroneously predicted to be deleterious. To obtain an alignment with a specified median conservation, the LacI protein sequence of E. coli was submitted to the SIFT website and the median conservation setting was adjusted. Because the available homologous sequences are distantly related to E. coli LacI, alignments with higher median conservation values could not be obtained. To obtain alignments with median conservation values greater than 3.25, closely related sequences were simulated by starting with an alignment of identical E. coli LacI sequences. A position and a sequence were randomly selected from the LacI alignment with median conservation 2.75. The amino acid at this location was substituted into the starting alignment. Amino acids continued to be randomly selected and substituted until the desired median conservation was met. The simulated alignment was then evaluated for its performance as previously described (2), and the plotted value is the average performance of 100 simulated alignments.
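
The simulation procedure in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: conservation at a position is assumed to be log2(20) minus the Shannon entropy of the observed residues, the alignment is assumed to be ungapped, and the toy sequences, the target value of 3.6, and the names `median_conservation` and `simulate_alignment` are all hypothetical.

```python
import math
import random
from collections import Counter
from statistics import median


def column_conservation(column):
    """Assumed measure: log2(20) minus the Shannon entropy of the column, in bits."""
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in Counter(column).values())
    return math.log2(20) - entropy


def median_conservation(alignment):
    """Median of the per-position conservation values of an ungapped alignment."""
    return median(column_conservation(col) for col in zip(*alignment))


def simulate_alignment(query, source_alignment, target, rng, max_steps=100_000):
    """Start from identical copies of the query and copy randomly chosen
    residues from the source alignment until the median conservation
    has dropped to the target value."""
    sim = [list(query) for _ in source_alignment]
    for _ in range(max_steps):
        if median_conservation(sim) <= target:
            return ["".join(seq) for seq in sim]
        row = rng.randrange(len(source_alignment))
        col = rng.randrange(len(query))
        sim[row][col] = source_alignment[row][col]
    raise RuntimeError("target median conservation not reached")


# Toy data for illustration only; not LacI sequences.
QUERY = "ACDEFGHIKL"
SOURCE = ["ACDEFGHIKL", "ACDQFGHLKL", "SCDEYGHIRL", "ACNEFAHIKM"]

simulated = simulate_alignment(QUERY, SOURCE, target=3.6, rng=random.Random(0))
print(simulated, round(median_conservation(simulated), 2))
```

The `max_steps` guard is there because the loop cannot finish if the source alignment itself is more conserved than the target. Averaging a performance measure over 100 such simulated alignments, as the caption describes, would wrap this function in an outer loop with a different random seed per run.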
