Persistently conserved positions in structurally similar, sequence dissimilar proteins: roles in preserving protein fold and function

Protein Sci. 2002 Feb;11(2):350-60. doi: 10.1110/ps.18602.

Abstract

Many protein pairs that share the same fold do not have any detectable sequence similarity, providing a valuable source of information for studying sequence-structure relationships. In this study, we use a stringent data set of structurally similar, sequence-dissimilar protein pairs to characterize residues that may play a role in determining protein structure and/or function. For each protein in the database, we identify amino-acid positions that show residue conservation within both close and distant family members. These positions are termed "persistently conserved". We then determine the "mutually" persistently conserved (MPC) positions: those structurally aligned positions in a protein pair that are persistently conserved in both pair mates. Because of their intra- and interfamily conservation, these positions are good candidates for determining protein fold and function. We find that 45% of the persistently conserved positions are mutually conserved. A significant fraction of them are located at positions critical for secondary structure determination; they are mostly buried, and many of them form spatial clusters within their protein structures. A substitution matrix based on the subset of MPC positions shows two distinct characteristics: (i) it is different from other available matrices, even those that are derived from structural alignments; (ii) its relative entropy is high, emphasizing the special residue restrictions imposed on these positions. Such a substitution matrix should be valuable for protein design experiments.
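The two computational ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of a structural alignment as a list of position pairs, and the toy numbers are all assumptions made for illustration. The first function selects aligned positions that are persistently conserved in both pair mates; the second computes the relative entropy of a substitution matrix using the standard definition H = sum over residue pairs of q_ij * log2(q_ij / (p_i * p_j)), which may differ in detail from the procedure used in the paper.

```python
import math


def mutually_persistently_conserved(aligned_pairs, conserved_a, conserved_b):
    """Keep structurally aligned position pairs (i, j) where position i is
    persistently conserved in protein A and position j in protein B.

    All three arguments are hypothetical inputs: the abstract does not
    specify how alignments or conserved positions are stored.
    """
    return [(i, j) for i, j in aligned_pairs
            if i in conserved_a and j in conserved_b]


def relative_entropy(pair_freqs, background):
    """Relative entropy in bits of a substitution matrix, given observed
    pair frequencies q_ij and background residue frequencies p_i.

    Standard definition: H = sum_ij q_ij * log2(q_ij / (p_i * p_j)).
    """
    h = 0.0
    for (a, b), q in pair_freqs.items():
        if q > 0:  # zero-frequency pairs contribute nothing to the sum
            h += q * math.log2(q / (background[a] * background[b]))
    return h


# Toy structural alignment of a protein pair (made-up positions).
aligned_pairs = [(3, 5), (4, 6), (7, 9), (10, 12)]
conserved_a = {3, 7, 10}   # persistently conserved positions in protein A
conserved_b = {5, 6, 12}   # persistently conserved positions in protein B

mpc = mutually_persistently_conserved(aligned_pairs, conserved_a, conserved_b)
print(mpc)  # [(3, 5), (10, 12)]

# Toy two-letter alphabet with made-up frequencies (ordered pairs).
background = {"A": 0.5, "L": 0.5}
pair_freqs = {("A", "A"): 0.4, ("L", "L"): 0.4,
              ("A", "L"): 0.1, ("L", "A"): 0.1}
h_bits = relative_entropy(pair_freqs, background)
print(round(h_bits, 3))
```

A higher relative entropy means the observed substitutions deviate more strongly from what background frequencies alone would predict, which is consistent with the abstract's reading that MPC positions are under special residue restrictions.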

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Animals
  • Databases, Factual
  • Fungal Proteins
  • Humans
  • Hydrolases / chemistry*
  • Lipase / chemistry*
  • Peptides
  • Protein Conformation
  • Protein Folding
  • Proteins / analysis
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / genetics
  • Sequence Alignment / methods*
  • Solvents
  • Xanthobacter / enzymology

Substances

  • Fungal Proteins
  • Peptides
  • Proteins
  • Solvents
  • Hydrolases
  • Lipase
  • lipase B, Candida antarctica
  • haloalkane dehalogenase