Protein Sci. 2001 Feb;10(2):285-92.
doi: 10.1110/ps.31901.

Sialidase-like Asp-boxes: Sequence-Similar Structures Within Different Protein Folds

Free PMC article

R R Copley et al. Protein Sci.

Abstract

Sequence similarity is the most common measure currently used to infer homology between proteins. Typically, homologous protein domains show sequence similarity over their entire lengths. Here we identify Asp box motifs, initially found as repeats in sialidases and neuraminidases, in new structural and sequence contexts. These motifs represent significantly similar sequences, localized to beta hairpins within proteins that are otherwise different in sequence and three-dimensional structure. By performing a combined sequence- and structure-based analysis we detect Asp boxes in more than nine protein families, including bacterial ribonucleases, sulfite oxidases, reelin, netrins, some lipoprotein receptors, and a variety of glycosyl hydrolases. Although the function common to each of these proteins, if any, remains unclear, we discuss possible functions of Asp boxes on the basis of previously determined experimental results and discuss different evolutionary scenarios for the origin of Asp-box containing proteins.

Figures

Fig. 1.
(a) Cα traces of Asp box structures shown in Table 1 represented using Molscript (Kraulis 1991). Only the best match from each of the β propellers is shown. PDB codes are as for Table 1. The side chain atoms of the core conserved residues are shown in ball and stick representation. The one letter amino acid codes and residue numbers are given. Water molecules found in equivalent locations in all structures are illustrated as red spheres. (b) Schematic representation of the location of Asp box motifs within different protein topologies. Amino and carboxy termini are labeled N and C, respectively. Arrows represent β strands, and cylinders α helices. Asp boxes identified in the structural search are boxed with dotted lines.
Fig. 2.
Multiple alignment of Asp box sequences. Only one of each family of Asp box sequence-containing proteins has been represented. The majority of Asp box sequences are 14 amino acids in length. However, relative to these, a single reelin Asp box contains a single amino acid insertion, and several other sequences contain a single amino acid deletion. This alignment has been colored using CHROMA (Leo Goodstadt and Chris P. Ponting, unpubl.) and an 80% consensus: Hydrophobic ('h'; ACFGHILMTVWY) residues are highlighted in yellow, conserved residues (>80%) are shown as yellow on black (S and T are treated as equivalent, as are F, W, and Y), big ('b'; EFIKLMQRWY) residues are blue on yellow, small ('s'; ACDGNPSTV) residues are in green, and polar ('p'; CDEHKNQRST) residues are in blue. The sequences shown are: CHB_SERMA, Serratia marcescens chitobiase (GenBank identifier [gi] 3023484); NANH_BACFR, Bacteroides fragilis sialidase (gi 400354); human reelin (gi 4760438); PEP1_YEAST, S. cerevisiae Vps10p (gi 417462); Salmonella typhimurium spi4K (gi 3323596); Avic_ASPAC, Aspergillus aculeatus Avicelase III (gi 3242655); UNC6_CAEEL, C. elegans Unc-6 (gi 465001); H136_ARATH, Arabidopsis thaliana photosystem II stability/assembly factor HCF136 (gi 6016183); FRUA_STRMU, Streptococcus mutans fructanase (gi 2500931); slr1403/SYNY3, Synechocystis sp. slr1403 (gi 1652714); bacteriophage #D endo-N-acetylneuraminidase (gi 3551474); ORF_MYXXA, Myxococcus xanthus ORF (gi 5690376); Ngluc_ENTSP, Enterobacter sp. N-acetyl-beta-D-glucosaminidase (gi 4204206); vrlC/DICNO, Dichelobacter nodosus vrlC (gi 3482864); SLRep/THETH, Thermus thermophilus S-layer repressor (gi 2104901); YkuO/BACSU, Bacillus subtilis YkuO (gi 2632236); APE1882_AERPE, Aeropyrum pernix APE1882 (gi 5105574); and CSPr_LACDE, Lactobacillus delbrueckii subsp. bulgaricus cell surface proteinase (gi 2127379).
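The consensus rule described in the Fig. 2 caption can be illustrated with a minimal Python sketch. This is not the CHROMA implementation. The residue classes and the S/T and F/W/Y equivalences are taken from the caption, but everything else is an assumption made for illustration: the function names, the use of "at least 80%" as the cutoff for both conserved residues and classes (the caption states >80% for conserved residues), the order in which classes are tried (smallest set first), and the toy alignment, which is hypothetical and not taken from the paper.

```python
from collections import Counter

# Residue classes as listed in the Fig. 2 caption. They are tried in this
# order (smallest set first); that precedence is an assumption of this sketch.
CLASSES = {
    "s": set("ACDGNPSTV"),     # small
    "b": set("EFIKLMQRWY"),    # big
    "p": set("CDEHKNQRST"),    # polar
    "h": set("ACFGHILMTVWY"),  # hydrophobic
}

# Residues the caption treats as equivalent when scoring conservation.
GROUPS = {"S": "ST", "T": "ST", "F": "FWY", "W": "FWY", "Y": "FWY"}


def column_consensus(column, threshold_pct=80):
    """Return a residue letter, a class code, or '.' for one alignment column."""
    n = len(column)
    residues = Counter(aa for aa in column if aa != "-")  # gaps never count
    if not residues:
        return "."

    # 1. Conserved residue: one residue or equivalence group reaches the cutoff.
    groups = Counter()
    for aa, count in residues.items():
        groups[GROUPS.get(aa, aa)] += count
    group, group_count = groups.most_common(1)[0]
    if group_count * 100 >= threshold_pct * n:
        # Report the most frequent actual residue within the winning group.
        members = [aa for aa in residues if GROUPS.get(aa, aa) == group]
        return max(members, key=residues.__getitem__)

    # 2. Otherwise: the first residue class that covers enough of the column.
    for code, members in CLASSES.items():
        covered = sum(count for aa, count in residues.items() if aa in members)
        if covered * 100 >= threshold_pct * n:
            return code

    return "."


def consensus(alignment, threshold_pct=80):
    """Build a consensus string from equal-length aligned sequences."""
    return "".join(column_consensus(col, threshold_pct) for col in zip(*alignment))


# Hypothetical toy alignment, invented for this example only.
TOY_ALIGNMENT = [
    "SRDGGKTW",
    "STDNGQTW",
    "TKDGGRSF",
    "SEDAGKTY",
    "SLDGGETW",
]

if __name__ == "__main__":
    print(consensus(TOY_ALIGNMENT))
```

Integer arithmetic is used for the cutoff so that a column sitting exactly on the 80% boundary is not affected by floating-point rounding.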
