PLoS Biol. 2019 May 31;17(5):e3000301.
doi: 10.1371/journal.pbio.3000301. eCollection 2019 May.

Proteome-wide Analysis of Chaperone-Mediated Autophagy Targeting Motifs

Free PMC article

Philipp Kirchner et al. PLoS Biol.


Chaperone-mediated autophagy (CMA) contributes to the lysosomal degradation of a selective subset of proteins. Selectivity lies in the chaperone heat shock cognate 71 kDa protein (HSC70) recognizing a pentapeptide motif (KFERQ-like motif) in the protein sequence essential for subsequent targeting and degradation of CMA substrates in lysosomes. Interest in CMA is growing due to its recently identified regulatory roles in metabolism, differentiation, cell cycle, and its malfunctioning in aging and conditions such as cancer, neurodegeneration, or diabetes. Identification of the subset of the proteome amenable to CMA degradation could further expand our understanding of the pathophysiological relevance of this form of autophagy. To that effect, we have performed an in silico screen for KFERQ-like motifs across proteomes of several species. We have found that KFERQ-like motifs are more frequently located in solvent-exposed regions of proteins, and that the position of acidic and hydrophobic residues in the motif plays the most important role in motif construction. Cross-species comparison of proteomes revealed higher motif conservation in CMA-proficient species. The tools developed in this work have also allowed us to analyze the enrichment of motif-containing proteins in biological processes on an unprecedented scale and discover a previously unknown association between the type and combination of KFERQ-like motifs in proteins and their participation in specific biological processes. To facilitate further analysis by the scientific community, we have developed a free web-based resource (KFERQ finder) for direct identification of KFERQ-like motifs in any protein sequence. This resource will contribute to accelerating understanding of the physiological relevance of CMA.

Conflict of interest statement

The authors have declared that no competing interests exist based on the content of the submitted manuscript.


Fig 1. Frequency and types of KFERQ-like motifs in the human proteome.
(A) Scheme of the building rules of canonical, phosphorylation-, and acetylation-generated KFERQ-like motifs. (B) Percentage of proteins in the human proteome (filtered for reviewed entries) harboring the indicated classes of KFERQ-like motifs. Occurrence of motifs is ranked as canonical > phosphorylation-generated > acetylation-generated, and proteins with a combination of motifs are assigned to a group based on their highest-ranking motif. (C) Percentages of the reviewed human proteome harboring particular combinations of KFERQ-like motifs (generated by splitting the data from Fig 1B into all possible motif combinations). (D) Linear model for the correlation between the number of canonical motifs and protein length. The blue line represents the ordinary least squares regression with 95% confidence intervals (red area) using the relationship: log2(number of motifs) = protein length. Three very long proteins were removed as outliers (Cook’s distance > 1). R² is the goodness-of-fit statistic for the fitted model. acetyl., acetylation; phosp, phosphorylation.
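The ranking rule in (B) can be illustrated with a minimal Python sketch. The caption does not spell out the building rules from (A), so the rules coded here are an assumption based on the commonly cited definition: a pentapeptide flanked on either side by glutamine (Q) whose other four residues contain exactly one acidic residue (D/E), one or two basic residues (K/R), and one or two hydrophobic residues (F/I/L/V). It is further assumed that S/T/Y can stand in for the acidic residue (phosphorylation-generated) and that K can stand in for the flanking Q (acetylation-generated). All function names are hypothetical, and this is not the KFERQ finder implementation.

```python
# Assumed residue classes (not stated in the caption).
POSITIVE = set("KR")
HYDROPHOBIC = set("FILV")
ACIDIC = set("DE")
PHOSPHO_ACCEPTORS = set("STY")
# Ranking taken from the caption: canonical > phosphorylation > acetylation.
RANK = ["canonical", "phosphorylation", "acetylation"]


def _core_matches(core, acidic_like):
    """True if the four non-flanking residues fit the assumed rule."""
    pos = sum(aa in POSITIVE for aa in core)
    hyd = sum(aa in HYDROPHOBIC for aa in core)
    neg = sum(aa in acidic_like for aa in core)
    # One acidic-like residue; the other three split 1+2 or 2+1.
    return neg == 1 and pos + hyd == 3 and 1 <= pos <= 2


def classify_pentapeptide(pep):
    """Return the highest-ranking motif class of a 5-mer, or None."""
    found = set()
    # The flanking residue may sit at either end of the pentapeptide.
    for flank, core in ((pep[0], pep[1:]), (pep[-1], pep[:-1])):
        if flank == "Q" and _core_matches(core, ACIDIC):
            found.add("canonical")
        elif flank == "Q" and _core_matches(core, PHOSPHO_ACCEPTORS):
            found.add("phosphorylation")
        elif flank == "K" and _core_matches(core, ACIDIC):
            found.add("acetylation")
    for name in RANK:
        if name in found:
            return name
    return None


def find_motifs(sequence):
    """Slide a 5-residue window and list (start, pentapeptide, class)."""
    hits = []
    for i in range(len(sequence) - 4):
        pep = sequence[i:i + 5]
        cls = classify_pentapeptide(pep)
        if cls is not None:
            hits.append((i, pep, cls))
    return hits


def protein_class(sequence):
    """Assign a protein to the group of its highest-ranking motif."""
    classes = {cls for _, _, cls in find_motifs(sequence)}
    for name in RANK:
        if name in classes:
            return name
    return None
```

Under these assumptions, a protein carrying both a phosphorylation-generated and an acetylation-generated motif is placed in the phosphorylation group, which mirrors the grouping described for (B).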
Fig 2. Distribution of KFERQ-like motifs within protein sequences.
(A) Distribution of canonical KFERQ-like motifs along the protein length (normalized to a scale from 0 [N-terminus] to 1 [C-terminus]). The histograms show the count of motifs at the relative position with a bin size of 0.02. (B) The first 10% (N-terminus; top) and last 10% (C-terminus; bottom) of the normalized protein length in Fig 2A, shown here with a bin size of 0.001. The C-terminal plot (bottom) is mirrored for easier comparison. The red line indicates the slope of the reduction in KFERQ-like motifs. (C, D) Bar plots showing the average of exposed amino acids, as predicted from the primary sequence, using JPred4 for proteins validated as CMA substrates (C) or proteins in the human proteome harboring one canonical motif (D). For each protein, a region ±30 amino acids around the central amino acid of the motifs was isolated and aligned on the KFERQ-like motifs. The percentage of exposed residues was then calculated for each position. The red line indicates the mean percentage of exposure for all amino acids in all investigated proteins. Amino acids that are part of the KFERQ-like motifs are highlighted in blue. (E-H) Examples of domain localization and experimentally confirmed PTMs in KFERQ-like motifs of DJ-1 (E), alpha-synuclein (F), CHK1 (G) and PLIN3 (H). Canonical motifs are marked as yellow bars, phosphorylation-generated in blue, and acetylation-generated in green. Protein structures were obtained from the RCSB Protein Data Bank (PDB) using PDB IDs 1J42 (for DJ-1 [25]); 1XQ8, 2KKW, and 2N0A (for alpha-synuclein [26]); 4FSM (for Chk1 [27]); and 1SZI (for PLIN3 [28]). The structures of the KFERQ-like motifs are shown as strings and ribbons colored based on amino acid properties. PTMs shown: ubiquitylation (ub), phosphorylation (P), and oxidation (Ox). Arrows: location of the motif in the protein structure. The cartoon in (G) depicts the conformational change in Chk1 that releases autoinhibition of its catalytic activity.
CHK1, checkpoint kinase 1; CMA, chaperone-mediated autophagy; Memb. Bind., Membrane Binding; Ox, oxidation; P, phosphorylation; PARK7, Parkinsonism associated deglycase; PAT, perilipin/ADRP/TIP47; PLIN3, perilipin 3; PTM, posttranslational modification; ub, ubiquitylation.
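The normalization and binning described for (A) can be sketched as follows. The caption does not say which residue of the motif defines its position, so taking the central (third) residue divided by protein length is an assumption, as are the function names.

```python
def relative_position(motif_start, protein_length):
    """Position of a motif on a 0 (N-terminus) to 1 (C-terminus) scale.

    Assumption: the motif is located by its central residue, i.e. the
    0-based start index plus 2.
    """
    return (motif_start + 2) / protein_length


def histogram(rel_positions, n_bins=50):
    """Count motifs per bin; 50 bins correspond to a bin size of 0.02."""
    counts = [0] * n_bins
    for rel in rel_positions:
        # Clamp so that a value of exactly 1.0 lands in the last bin.
        idx = min(int(rel * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts
```

For example, a motif starting at index 48 in a 100-residue protein has a relative position of 0.5 and falls in bin 25 of 50.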
Fig 3. Amino acid positioning and frequencies within KFERQ-like motifs.
(A-C) Frequency of amino acids at the four variable positions in canonical (A), phosphorylation-generated (B), and acetylation-generated (C) motifs in the human proteome. To allow superimposition, all motifs were aligned with a downstream glutamine. The amino acid positions are given relative to the glutamine (−1 = closest and −4 = furthest away). For each amino acid, the counts at each position are normalized as the percentage of the sum of all four positions. The phosphorylation acceptors serine, threonine, and tyrosine (red) are classified as acidic because they appear as an acidic residue once phosphorylated. Red boxes highlight consistent changes in abundance across motif types (see text for details). (D) Frequency of amino acids grouped by biochemical properties (basic, hydrophobic, acidic) at the four variable positions. The groups are the same three types of KFERQ-like motifs as shown in Fig 3A–3C. (E) Comparison of amino acid frequencies at each position in canonical motifs from the human proteome and from a permutated proteome. Amino acid counts from A are divided by the counts in motifs from permutated proteins. Means are from 40 random samples of 10% of the data sets each. ***p < 0.001, **p < 0.01, *p < 0.05. The p-values from two-sided t tests are corrected (Bonferroni) by the number of comparisons (n = 32). hydroph., hydrophobic.
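The per-amino-acid normalization used in (A-C) can be sketched in Python. This assumes the input pentapeptides are already oriented with the flanking residue last, so that string indices 0 to 3 map to positions −4 to −1; the function name is hypothetical.

```python
from collections import defaultdict


def position_percentages(aligned_motifs):
    """Percentage of each amino acid's occurrences at positions -4..-1.

    aligned_motifs: 5-mers oriented with the flanking residue last
    (an assumption about how the alignment is represented).
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for motif in aligned_motifs:
        for i, aa in enumerate(motif[:4]):
            counts[aa][i] += 1
    result = {}
    for aa, per_pos in counts.items():
        total = sum(per_pos)
        # Normalize each amino acid to the sum over its four positions.
        result[aa] = {
            pos: 100.0 * n / total
            for pos, n in zip((-4, -3, -2, -1), per_pos)
        }
    return result
```

Because each amino acid is normalized to its own total, the four percentages for any residue sum to 100, which makes positional preferences comparable between abundant and rare residues.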
Fig 4. Conservation of KFERQ-like motifs and CMA components among species.
(A) Percentage of proteins with the indicated types of KFERQ-like motifs in the referenced proteomes of M. musculus, D. melanogaster and S. cerevisiae. Only reviewed Swiss-Prot entries are included. The occurrence of motifs is ranked as canonical > phosphorylation-generated > acetylation-generated, and proteins with a combination of motifs are assigned to a group based on their highest-ranking motif. (B) Scatterplot of the conservation of motifs from human proteins with a single canonical motif in orthologs from the list of species predicted to be able or unable to perform CMA based on detection of LAMP-2A (S4C Fig). Sequences are aligned in MUSCLE and motifs identified in the pentapeptides match the exact position of the human motif. The conservation score was calculated as follows: $(n_{\mathrm{conserved}} + 0.5\,n_{\mathrm{partial}} - n_{\mathrm{noOrth}}) / n_{\mathrm{species}}$, where $n_{\mathrm{partial}}$ = species with motifs of a different type and $n_{\mathrm{noOrth}}$ = species with no ortholog identified. A conservation score >0 indicates that it is more likely than not to find an ortholog with a motif at the same position as the human protein. (C) Conservation of CMA machinery across species. Proteins involved in CMA are grouped based on their function (effectors and modulators) and localization (lysosomal and extra-lysosomal). The colored disk next to the name of each element represents the conservation between CMA-able and CMA-unable species, as indicated by the lateral color bar. Positive and negative symbols indicate their function as activators or inhibitors of CMA activity.
AKT1, RAC-alpha serine/threonine-protein kinase; Cath A, lysosomal protective protein/cathepsin A; CMA, chaperone-mediated autophagy; eF1α, Elongation factor 1-alpha; GFAP, Glial fibrillary acidic protein; HSC70, Heat shock cognate 71 kDa protein; HSP40, DnaJ homolog subfamily B member 1; HSP90, Heat shock protein HSP 90; LAMP-2A, lysosome-associated membrane protein type 2A; NFAT, nuclear factor of activated T cells; NRF-2, nuclear factor erythroid 2-related factor 2; PHLPP1, PH domain leucine-rich repeat-containing protein phosphatase 1; Rab11, Ras-related protein Rab-11; RAC1, Ras-related C3 botulinum toxin substrate 1; RARα, Retinoic acid receptor alpha.
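The conservation score in (B) can be written as a small Python function. This sketch reads the formula as the quantity (conserved + 0.5 × partial − no ortholog) divided by the number of species, which is an interpretation of the caption; the function and parameter names are hypothetical.

```python
def conservation_score(n_conserved, n_partial, n_no_ortholog, n_species):
    """Conservation score for one human motif across a set of species.

    n_conserved:   species with the same motif type at the same position
    n_partial:     species with a motif of a different type there
    n_no_ortholog: species with no ortholog identified
    n_species:     total number of species compared
    """
    if n_species <= 0:
        raise ValueError("n_species must be positive")
    return (n_conserved + 0.5 * n_partial - n_no_ortholog) / n_species
```

With 10 species, of which 6 conserve the motif, 2 carry a motif of a different type, and 1 has no ortholog, the score is (6 + 1 − 1) / 10 = 0.6.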
Fig 5. Enrichment of proteins with KFERQ-like motifs in biological processes.
(A) Enrichment map analysis of the association of proteins with KFERQ-like motifs with biological processes. The nodes are radial heat maps in which size shows the number of proteins within the given annotation and intensity of color indicates association with a specific kind of motif (the position of each type of motif in the radial heat map is indicated in the legend). Edges represent similarity between nodes, and color-coded circles indicate clusters associated with a specific motif type (yellow for canonical, blue for phosphorylation-generated, and green for acetylation-generated motifs). (B,C) Enrichment for human proteins annotated under cellular components biogenesis (B) and under protein catabolic processes (C), grouped by relative content of KFERQ-like motif classes. For each protein, the fractional content (0% to 100%) of canonical, phosphorylation-, and acetylation-generated motifs is calculated. Proteins are binned by motif composition using a 5% bin size for each dimension. The combined score for enrichment (−ln(p-value) × z-score) of the proteins in each bin (small triangles within the plot area) is color coded from blue (low) to red (high). Acet., acetylation; Phosph., phosphorylation.
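The composition binning and combined score described for (B,C) can be sketched as follows. The caption gives only the formula and the 5% bin size, so the function names and the choice to floor each percentage to the lower bin edge are assumptions.

```python
import math


def motif_composition(n_canonical, n_phospho, n_acetyl):
    """Fractional content (in percent) of each motif class in a protein."""
    total = n_canonical + n_phospho + n_acetyl
    if total == 0:
        raise ValueError("protein has no KFERQ-like motifs")
    return tuple(100.0 * n / total
                 for n in (n_canonical, n_phospho, n_acetyl))


def composition_bin(composition, bin_size=5):
    """Lower edge of the 5% bin for each dimension (assumed flooring)."""
    return tuple(int(pct // bin_size) * bin_size for pct in composition)


def combined_score(p_value, z_score):
    """Combined enrichment score: -ln(p-value) multiplied by the z-score."""
    return -math.log(p_value) * z_score
```

A protein with two canonical, one phosphorylation-generated, and one acetylation-generated motif has a composition of 50%, 25%, and 25%, and so falls in the (50, 25, 25) bin.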


