Are there analogous sequence positions in families of related proteins where disease-linked mutations occur with unusually high frequency? We attempt to answer this question by examining sequence alignments for G protein-coupled receptors (GPCRs) and voltage-gated potassium channels, two families with a significant number of missense mutations linked to some form of human disease. When the disease-linked mutations are mapped onto the aligned sequences for each family, many aligned sites carry disease-linked mutations in more than one protein. The statistical significance of these aligned sites is judged by comparison with artificially generated random datasets. A modest number of aligned sites are statistically significant; we refer to these as "phenotologous" sequence positions. Phenotologous sites are aligned positions at which mutations linked to disease phenotypes occur with high frequency within a family of proteins. The most interesting of these sites are those that are not conserved; such sites are apparently critical in defining structural or functional differences between related proteins. Phenotology may be used to make experimentally testable predictions regarding medical genetics, the molecular basis of disease, and protein structure-function relationships.
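The randomization idea described above can be illustrated with a minimal sketch. The abstract does not specify the null model, so the one used here is an assumption: each protein keeps its number of mutated alignment columns, but those columns are redrawn uniformly at random. The function names, the threshold `k`, and the toy data are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def column_counts(mutated_columns):
    """For each alignment column, count the proteins mutated at that column."""
    counts = Counter()
    for cols in mutated_columns.values():
        counts.update(set(cols))  # each protein contributes at most once per column
    return counts


def sites_at_or_above(counts, k):
    """Number of alignment columns mutated in at least k proteins."""
    return sum(1 for c in counts.values() if c >= k)


def randomization_p_value(mutated_columns, alignment_length, k,
                          n_trials=2000, seed=0):
    """Empirical p-value for the observed number of columns shared by >= k proteins.

    Assumed null model: every protein keeps its number of mutated columns,
    but the columns are drawn uniformly at random without replacement.
    """
    rng = random.Random(seed)
    observed = sites_at_or_above(column_counts(mutated_columns), k)
    hits = 0
    for _ in range(n_trials):
        shuffled = {
            name: rng.sample(range(alignment_length), len(set(cols)))
            for name, cols in mutated_columns.items()
        }
        if sites_at_or_above(column_counts(shuffled), k) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_trials + 1)


# Hypothetical toy data: protein name -> alignment columns with disease-linked mutations.
toy = {
    "protein_A": [5, 12, 40],
    "protein_B": [5, 12, 77],
    "protein_C": [5, 12, 90],
    "protein_D": [5, 33],
}
observed, p = randomization_p_value(toy, alignment_length=100, k=3)
print(observed, p)
```

A real analysis would also need to account for alignment gaps and for uneven mutation ascertainment across proteins, neither of which this sketch models.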
(c) 2004 Wiley-Liss, Inc.