We have developed a formalism and a computational method for analyzing the potential functional consequences of non-synonymous single nucleotide polymorphisms. Our approach uses a structural model and phylogenetic information to derive a selection of structure- and sequence-based features that serve as indicators of an amino acid polymorphism's effect on function. The feature values can be integrated into a probabilistic assessment of whether an amino acid polymorphism will affect the function or stability of a target protein. The method has been validated with data sets of unbiased mutations in the lac repressor and lysozyme. Applying our methodology to recent surveys of genetic variation in the coding regions of clinically important genes, we estimate that approximately 26-32% of the natural non-synonymous single nucleotide polymorphisms have effects on function. This estimate suggests that a typical person will have about 6,240-12,800 heterozygous loci that encode proteins with functional variation due to natural amino acid polymorphism.
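
As a rough consistency check on the final estimate, the short sketch below back-calculates the per-person number of heterozygous non-synonymous loci implied by the two figures quoted above. It assumes that the lower fraction (26%) pairs with the lower count (6,240) and the upper fraction (32%) with the upper count (12,800). That pairing is an assumption, the per-person total is not stated in the text, and the function name is purely illustrative.

```python
def implied_total_loci(functional_loci: int, functional_percent: int) -> float:
    """Back-calculate the total number of heterozygous non-synonymous loci.

    Assumes functional_loci = total_loci * functional_percent / 100,
    which is an illustrative reading of the estimate, not a stated formula.
    """
    return functional_loci * 100 / functional_percent


# Assumed pairing: lower bound with lower bound, upper bound with upper bound.
low_total = implied_total_loci(6240, 26)
high_total = implied_total_loci(12800, 32)

print(low_total, high_total)  # prints "24000.0 40000.0"
```

Under that pairing, the quoted range would correspond to roughly 24,000 to 40,000 heterozygous non-synonymous loci per person.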
Copyright 2001 Academic Press.