Protein Sci. 2002 Feb;11(2):350-60.
doi: 10.1110/ps.18602.

Persistently Conserved Positions in Structurally Similar, Sequence Dissimilar Proteins: Roles in Preserving Protein Fold and Function

Free PMC article

Iddo Friedberg et al. Protein Sci.

Abstract

Many protein pairs that share the same fold do not have any detectable sequence similarity, providing a valuable source of information for studying the sequence-structure relationship. In this study, we use a stringent data set of structurally similar, sequence-dissimilar protein pairs to characterize residues that may play a role in the determination of protein structure and/or function. For each protein in the database, we identify amino-acid positions that show residue conservation within both close and distant family members. These positions are termed "persistently conserved". We then determine the "mutually" persistently conserved (MPC) positions: those structurally aligned positions in a protein pair that are persistently conserved in both pair mates. Because of their intra- and interfamily conservation, these positions are good candidates for determining protein fold and function. We find that 45% of the persistently conserved positions are mutually conserved. A significant fraction of them are located in positions critical for secondary structure determination; they are mostly buried, and many of them form spatial clusters within their protein structures. A substitution matrix based on the subset of MPC positions shows two distinct characteristics: (i) it is different from other available matrices, even those that are derived from structural alignments; (ii) its relative entropy is high, emphasizing the special residue restrictions imposed on these positions. Such a substitution matrix should be valuable for protein design experiments.
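The definition of MPC positions can be illustrated with a minimal sketch. It assumes that the structural alignment is already available as a list of aligned residue-index pairs and that the persistently conserved positions of each pair mate have been computed separately; the function name, argument names, and toy data are hypothetical and are not taken from the paper.

```python
def mutually_persistently_conserved(aligned_pairs, pc_a, pc_b):
    """Return the aligned (i, j) pairs that are persistently conserved
    in BOTH pair mates.

    aligned_pairs: (index in protein A, index in protein B) tuples from a
                   structural alignment (assumed input).
    pc_a, pc_b:    persistently conserved positions of A and B (assumed
                   to have been determined beforehand).
    """
    pc_a, pc_b = set(pc_a), set(pc_b)
    return [(i, j) for i, j in aligned_pairs if i in pc_a and j in pc_b]


# Toy data, purely illustrative.
aligned_pairs = [(1, 4), (2, 5), (3, 6), (5, 9), (8, 12)]
pc_a = {2, 3, 8}
pc_b = {5, 9, 12}

mpc = mutually_persistently_conserved(aligned_pairs, pc_a, pc_b)
print(mpc)  # [(2, 5), (8, 12)]
```

Position 3 of protein A is conserved but its aligned partner (6) is not, so the pair is excluded; only pairs conserved on both sides count as MPC.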

Figures

Fig. 1.
A schematic flowchart describing the identification of mutually persistently conserved positions. See text for details.
Fig. 2.
Distribution of residue types in mutually persistently conserved (MPC) positions expressed as the log-odds ratio between the frequency of a residue in MPC positions (obs) and its frequency in the entire database of SSSD proteins (exp). All frequency differences were found to be statistically significant by a χ² test, except for leucine, asparagine, and valine (marked with ‘∧’).
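The log-odds ratio described in this caption can be sketched as follows. The caption does not state the logarithm base, so base 2 is an assumption here, and the function name and toy sequences are hypothetical.

```python
import math
from collections import Counter


def log_odds_by_residue(mpc_residues, all_residues):
    """log2(obs / exp) per residue type, where obs is the residue's
    frequency among MPC positions and exp its frequency in the whole
    data set. Base 2 is an assumption, not taken from the source."""
    obs = Counter(mpc_residues)
    exp = Counter(all_residues)
    n_obs, n_exp = sum(obs.values()), sum(exp.values())
    return {
        aa: math.log2((obs[aa] / n_obs) / (exp[aa] / n_exp))
        for aa in obs
        if exp[aa] > 0  # skip residues absent from the background
    }


# Toy data: one-letter residue codes, purely illustrative.
scores = log_odds_by_residue("GGGW", "GGAAAAWL")
print(scores)
```

A positive score means the residue type is over-represented at MPC positions relative to the background, and a negative score means it is under-represented.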
Fig. 3.
Amino-acid residue substitution matrices derived from (a) mutually persistently conserved positions and (b) all structurally aligned positions. Values are scaled to 1/10 bit.
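For orientation, the sketch below shows the standard BLOSUM-style log-odds construction with scores rounded to 1/10-bit units, together with the relative entropy of the resulting matrix. Whether the authors derived their matrices in exactly this way is an assumption; the two-letter alphabet and the pair frequencies are invented for illustration.

```python
import math


def log_odds_matrix(pair_freq, scale=10):
    """Build scaled log-odds scores and relative entropy from a joint
    distribution over ordered residue pairs (values must sum to 1).

    Scores are round(scale * log2(q_ab / (p_a * p_b))); scale=10 gives
    1/10-bit units. Relative entropy is sum(q_ab * log2 ratio), in bits.
    """
    # Marginal (background) frequency of each residue type.
    marginal = {}
    for (a, _b), q in pair_freq.items():
        marginal[a] = marginal.get(a, 0.0) + q

    scores, rel_entropy = {}, 0.0
    for (a, b), q in pair_freq.items():
        if q == 0:
            continue  # unobserved pairs get no score in this sketch
        bits = math.log2(q / (marginal[a] * marginal[b]))
        scores[(a, b)] = round(scale * bits)
        rel_entropy += q * bits
    return scores, rel_entropy


# Hypothetical symmetric pair frequencies over a two-letter alphabet.
pair_freq = {("A", "A"): 0.4, ("A", "B"): 0.1,
             ("B", "A"): 0.1, ("B", "B"): 0.4}
scores, rel_entropy = log_odds_matrix(pair_freq)
print(scores[("A", "A")], scores[("A", "B")])  # 7 -13
```

A higher relative entropy indicates that the observed pair frequencies deviate more strongly from what the background frequencies alone would predict, which corresponds to tighter restrictions on the allowed substitutions.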
Fig. 4.
Comparison between sequence-derived and structure-derived substitution matrices. The amino-acid pair frequency distributions that were used for the derivation of the substitution matrices were compared by the Jensen-Shannon divergence. A series of BLOSUM matrices were compared with the mutually persistently conserved-derived matrix (filled squares) and with the structurally derived matrix (open circles).
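The Jensen-Shannon divergence used for this comparison can be computed as in the sketch below. It assumes both pair-frequency distributions are given as equal-length lists over the same ordering of amino-acid pairs and uses base-2 logarithms, so the result lies between 0 and 1; the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0
    contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Identical distributions diverge by 0, disjoint ones by 1 bit.
print(jensen_shannon_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Unlike the Kullback-Leibler divergence, this measure is symmetric and remains finite when one distribution assigns zero frequency to a pair that the other observes. Note that `scipy.spatial.distance.jensenshannon` returns the square root of this quantity (the Jensen-Shannon distance), so the two are not interchangeable.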
Fig. 5.
Frequency of mutually persistently conserved (MPC) positions in secondary structure elements. The X-axis shows the positions in and flanking the secondary structure element (nomenclature after Aurora and Rose 1998). The flanking regions are marked with apostrophes, the in-element residues with digits, and the initial and terminal (capping) residues with "c". The Y-axis is the logarithm of the ratio between the actual frequency of MPC residues in a position and that expected at random, based on the overall frequency of MPC positions in the data. The positions in which MPCs were found to be significantly over- or under-represented are marked with "*". (a) α helices; (b) β strands.
Fig. 6.
Distribution of mutually persistently conserved positions (white bars) by solvent accessibility compared to all aligned residues (black bars). Residues were defined as buried when the solvent accessibility was <30% and exposed otherwise.
