Mol Biol Cell. 2012 Sep;23(18):3677-93.
doi: 10.1091/mbc.E12-01-0046. Epub 2012 Jul 25.

Sequence and Structural Analyses of Nuclear Export Signals in the NESdb Database

Free PMC article

Darui Xu et al. Mol Biol Cell. 2012.

Abstract

We compiled >200 nuclear export signal (NES)-containing CRM1 cargoes in a database named NESdb. We analyzed the sequences and three-dimensional structures of natural, experimentally identified NESs and of false-positive NESs that were generated from the database in order to identify properties that might distinguish the two groups of sequences. Analyses of amino acid frequencies, sequence logos, and agreement with existing NES consensus sequences revealed strong preferences for the Φ1-X(3)-Φ2-X(2)-Φ3-X-Φ4 pattern and for negatively charged amino acids in the nonhydrophobic positions of experimentally identified NESs but not of false positives. Strong preferences against certain hydrophobic amino acids in the hydrophobic positions were also revealed. These findings led to a new and more precise NES consensus. More important, three-dimensional structures are now available for 68 NESs within 56 different cargo proteins. Analyses of these structures showed that experimentally identified NESs are more likely than the false positives to adopt α-helical conformations that transition to loops at their C-termini and more likely to be surface accessible within their protein domains or be present in disordered or unobserved parts of the structures. Such distinguishing features for real NESs might be useful in future NES prediction efforts. Finally, we also tested CRM1-binding of 40 NESs that were found in the 56 structures. We found that 16 of the NES peptides did not bind CRM1, hence illustrating how NESs are easily misidentified.
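
The Φ1-X(3)-Φ2-X(2)-Φ3-X-Φ4 pattern described above can be illustrated with a minimal Python sketch that scans a protein sequence for candidate matches. This is not the authors' method: the set of residues allowed at the Φ positions (L, I, V, F, M), the function names, and the test sequence are assumptions made for illustration only. As the abstract stresses, a pattern match alone does not establish a functional NES.

```python
import re

# Assumption: hydrophobic (Phi) positions accept L, I, V, F or M.
# The refined consensus in the paper may allow a different set.
PHI = "[LIVFM]"

# Phi1-X(3)-Phi2-X(2)-Phi3-X-Phi4, wrapped in a lookahead so that
# overlapping candidates are all reported.
NES_PATTERN = re.compile(rf"(?=({PHI}.{{3}}{PHI}.{{2}}{PHI}.{PHI}))")

# 0-based offsets of the non-hydrophobic (X) positions within a 10-residue match.
SPACER_OFFSETS = (1, 2, 3, 5, 6, 8)


def find_nes_candidates(sequence):
    """Return (1-based start, matched segment) for every pattern match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in NES_PATTERN.finditer(seq)]


def count_acidic_spacers(segment):
    """Count Asp/Glu residues at the X positions of a matched segment."""
    return sum(1 for i in SPACER_OFFSETS if segment[i] in "DE")


if __name__ == "__main__":
    # Synthetic sequence, not taken from NESdb.
    for start, segment in find_nes_candidates("GGSLQEELDKLSLEGG"):
        print(start, segment, count_acidic_spacers(segment))
```

Counting acidic residues at the spacer positions reflects the reported preference for negatively charged amino acids there; how such a count should be weighted in a predictor is left open here.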

Figures

FIGURE 1:
Overview of CRM1 cargoes in NESdb. (A) The organisms from which CRM1 cargoes come. (B) The cellular localizations of CRM1 cargoes as defined by Gene Ontology (GO) annotations. (C) Cellular functions and biological processes in which CRM1 cargoes participate, as listed in their GO annotations.
FIGURE 2:
Sequence logos of experimental vs. false-positive NESs. (A) Sequence logo of experimental NESs in NESdb. (B) Sequence logo of negative sequences that fit the traditional NES consensus patterns. (C) Sequence logo of negative sequences that fit the Kosugi NES consensus patterns. Sequence logos were generated by the program WebLogo (http://weblogo.berkeley.edu/), where the x-axis is labeled with amino acid position, with #15 as the last amino acid in the sequence, and the y-axis represents the information content measured in bits. The overall height of each stack of letters indicates the sequence conservation at that position, and the height of a letter within the stack indicates the relative frequency of the amino acid.
FIGURE 3:
Position-specific amino acid frequencies of experimental vs. false-positive NESs. Amino acid frequency of (A) glutamate, (B) aspartate, and (C) tryptophan. Baseline refers to the background frequency of the amino acid.
FIGURE 4:
Position-specific evolutionary conservation scores of experimental NESs vs. false positives that fit the newly proposed consensus. The evolutionary conservation scores were computed with the program AL2CO.
FIGURE 5:
Examples of NESs in the PDB. (A) Crystal structures of NESs of snurportin-1 (PDB ID: 3GB8), HIV-1 Rev (3NBZ), and PKIα (3NBY) bound to CRM1. Cargo proteins are drawn as ribbon diagrams and their NESs colored pink, whereas the rest of the cargoes are colored from N- to C-termini in gradients of light to dark blue. CRM1 is shown in gray surface representation. (B) Examples of NESs that are located within 20 amino acids of the termini of protein domains: E7 of HPV16 (2EWL), MAPKK1 (2Y4I), and Gal3 (2XG3). The NESs of E7 and MAPKK1 are surface accessible, but the NES of Gal3 is not. (C) Examples of two NESs that are located far from protein termini: NESs of Yap1p (1SSE) and Tbx5 (2X6U). Both NESs shown here are flanked by long loops.
FIGURE 6:
CRM1–NES interactions. Direct interactions between recombinant purified GST-NESs and CRM1 are shown by pull-down binding assays. Immobilized GST-NESs of 40 CRM1 cargoes were incubated with CRM1 in the presence and absence of RanGTP. Bound proteins were resolved with SDS–PAGE and visualized by Coomassie staining. The 40 NESs were chosen because their 3D structures are available in the PDB. (A–C) CRM1 binders (except CHP1-2 and Topo2α-2). (D) A subset of the NESs that do not bind CRM1.
FIGURE 7:
Examples of secondary structural elements adopted by NESs. (A) The NES of PLC-δ1 (1DJX) adopts a combined α-helix-loop conformation, similar to the secondary structure of the CRM1-bound SNUPN NES. (B) The NES of Stat1 (residues 298–312, KQVLWDRTFSLFQQL; 1BF5) is entirely α-helical. (C) The NES of CDK5 (residues 64–78, VRLHDVLHSDKKLTL; 1UNG) is part of a β-sheet. The color schemes used here are the same as in Figure 5.
FIGURE 8:
Secondary structure elements (SSEs) of the NESs. (A) SSE composition of experimental NESs based on SSEs extracted from the 3D structures. (B) SSE composition of the false positives based on SSEs extracted from the 3D structures. (C) SSE composition of experimental NESs based on predicted SSEs. (D) SSE composition of the false positives based on predicted SSEs. SSEs from 3D structures were extracted using the program DSSP (Kabsch and Sander, 1983), and SSE prediction was carried out using the program PSIPRED (Jones, 1999; Buchan et al., 2010).
FIGURE 9:
Disorder propensities of the NESs. Predicted disorder scores of experimental NESs compared with those of the misidentified NESs and the false positives. Disorder scores were obtained for 50 residues preceding the NESs, the NESs, and 50 residues following the NESs. NES residues are found within the two vertical dotted lines. Disorder prediction was carried out by DISOPRED2 (Ward et al., 2004).
