, 1 (5), e49

Structural Evolution of the Protein Kinase-Like Superfamily

Eric D Scheeff et al. PLoS Comput Biol.

Abstract

The protein kinase family is large and important, but it is only one family in a larger superfamily of homologous kinases that phosphorylate a variety of substrates and play important roles in all three superkingdoms of life. We used a carefully constructed structural alignment of selected kinases as the basis for a study of the structural evolution of the protein kinase-like superfamily. The comparison of structures revealed a "universal core" domain consisting only of regions required for ATP binding and the phosphotransfer reaction. Remarkably, even within the universal core some kinase structures display notable changes, while still retaining essential activity. Hence, the protein kinase-like superfamily has undergone substantial structural and sequence revision over long evolutionary timescales. We constructed a phylogenetic tree for the superfamily using a novel approach that allowed for the combination of sequence and structure information into a unified quantitative analysis. When considered against the backdrop of species distribution and other metrics, our tree provides a compelling scenario for the development of the various kinase families from a shared common ancestor. We propose that most of the so-called "atypical kinases" are not intermittently derived from protein kinases, but rather diverged early in evolution to form a distinct phyletic group. Within the atypical kinases, the aminoglycoside and choline kinase families appear to share the closest relationship. These two families in turn appear to be the most closely related to the protein kinase family. In addition, our analysis suggests that the actin-fragmin kinase, an atypical protein kinase, is more closely related to the phosphoinositide-3 kinase family than to the protein kinase family. The two most divergent families, alpha-kinases and phosphatidylinositol phosphate kinases (PIPKs), appear to have distinct evolutionary histories. 
While the PIPKs probably have an evolutionary relationship with the rest of the kinase superfamily, the relationship appears to be very distant (and perhaps indirect). Conversely, the alpha-kinases appear to be an exception to the scenario of early divergence for the atypical kinases: they apparently arose relatively recently in eukaryotes. We present possible scenarios for the derivation of the alpha-kinases from an extant kinase fold.

Conflict of interest statement

Competing interests. The co-author of this manuscript is the editor-in-chief of PLoS Computational Biology.

Figures

Figure 1. Two Views of the Structure of PKA [70]
The structure consists of two subdomains: a small, primarily β-sheet N-terminal subdomain, and a larger, primarily helical C-terminal subdomain. ATP and metal ions are bound in the cleft between the two subdomains. The small left-side view depicts PKA in the “standard” orientation used by the authors when the structure was initially solved [12], and in many subsequent publications. The larger view on the right side depicts PKA in an “open-book” format that makes structural features in the two subdomains easier to compare between families. The open-book view is achieved by rotating the standard view 90° about the vertical axis, then splitting the two subdomains at the linker region and rotating each 90° in opposite directions about the horizontal axis. Helical secondary structures (both α-helices and 3–10 helices) are depicted as cylinders, and β-strands are depicted as arrows. Elements are labeled according to the standard conventions for PKA. Some secondary structure (particularly 3–10 helices) is not labeled in the standard PKA convention, and so is unlabeled here. One structure (Helix 1) was named by us (see text). Underlined labels belong to helical structures; non-underlined labels belong to β-strands. Secondary structure elements are colored according to their conservation status in the overall superfamily as follows: yellow, elements are part of the “universal core” seen in all kinases in the superfamily; orange, elements are present in more than two, but not all, of the kinases in the superfamily; purple, elements seen only in this family, but inserted within the portion of the chain forming the universal core; blue, elements seen only in this family, and connected to the N- or C-terminal ends of the universal core. A bound pseudosubstrate inhibitor (PKI) is present in the structure [12], and depicted in gray. This inhibitor likely describes the binding location of actual substrates of PKA.
The bound ATP molecule is rendered as a ball-and-stick model, while the bound Mg ions are rendered as gray spheres. The ATP and Mg ions are duplicated in mirror image and shown interacting with both the N- and C- terminal subdomains in the open-book rendering. The most critical and highly conserved residues in PKA (and the broader superfamily) are shown as ball-and-stick models in green, and labeled according to the standard PKA numbering scheme. In addition, the glycine-rich loop is also depicted in green, though individual glycine residues are not shown. The loop that forms the linker region between the subdomains is depicted in red. Other loops within the universal core are shown in white, except for loops linking purple regions (which are shown in purple), and loops outside of the universal core (shown in blue). Key loops described extensively in the text are labeled. For increased clarity, residues 300–350 have been removed from the C-terminus of PKA. This loop region is unique to PKA, and would have been colored blue if present in the figure. Molecular renderings in this figure were created with MOLSCRIPT [90].
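The open-book construction described in this legend can be sketched as a pair of rotations. This is only an illustration under stated assumptions, not the procedure the authors ran in MOLSCRIPT: the axis convention (x horizontal, y vertical), the sign of each rotation, and the function names `rot_x`, `rot_y`, `apply`, and `open_book` are all hypothetical. A real rendering would also rotate each subdomain about a hinge near the linker and translate the halves apart, which is omitted here.

```python
import math

def rot_y(deg):
    """Rotation matrix about the vertical (y) axis (assumed convention)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_x(deg):
    """Rotation matrix about the horizontal (x) axis (assumed convention)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def apply(matrix, point):
    """Multiply a 3x3 matrix by a 3D point, rounding away float noise."""
    return tuple(round(sum(m * p for m, p in zip(row, point)), 6)
                 for row in matrix)

def open_book(n_points, c_points):
    """Turn both subdomains 90 degrees about the vertical axis, then
    rotate each 90 degrees in opposite directions about the horizontal axis."""
    turn = rot_y(90)
    n_open = [apply(rot_x(+90), apply(turn, p)) for p in n_points]
    c_open = [apply(rot_x(-90), apply(turn, p)) for p in c_points]
    return n_open, c_open

# Hypothetical coordinates: one atom per subdomain, both starting on the x axis.
n_open, c_open = open_book([(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)])
print(n_open, c_open)
```

Starting from the same position, the two points end up on opposite sides of the horizontal axis, which is the effect of splitting the subdomains and opening them like a book.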
Figure 2. Views of Structural Representatives from Six Families in the Kinase-Like Superfamily Other Than the TPKs
Structures are shown in an open-face view, and using the same conventions as used for PKA in Figure 1. ATP and metal ions are shown in mirror image where available in the structure. Similar to Figure 1, secondary structural elements are colored according to their conservation status in the overall superfamily as follows: yellow, elements are part of the “universal core” seen in all kinases in the superfamily; orange, elements are present in more than two, but not all, of the kinases in the superfamily; red, elements shared between only two families; purple, elements seen only in this family, but inserted within the portion of the chain forming the universal core; blue, elements seen only in this family, and connected to the N- or C-terminal ends of the universal core. Secondary structural elements are labeled according to the standard conventions for the individual structure. As in Figure 1, the glycine-rich loop is rendered in green and the loop forming the linker region is rendered in red. For clarity, the conserved residues shown in Figure 1 are not rendered in these structures, though in most cases they are similar. Structures shown are as follows: (A) aminoglycoside phosphotransferase (APH(3′)-IIIa [24]); (B) CK (CKA-2 [23]); (C) ChaK [20]; (D) PI3K [21]; (E) AFK [22]; and (F) PIPKIIβ [19]. Molecular renderings in this figure were created with MOLSCRIPT [90].
Figure 3. Enhanced Sequence Alignment Derived from the Structural Alignment of Kinase Representatives
Abbreviated names of kinase representatives are provided with the gray box at the left-hand side of the figure (see Table 1 for more information on structures). The name is followed by the PDB ID [18] for the structure used in the alignment. The number in parentheses following the PDB ID is the residue number of the first residue shown in the alignment. The sequences of the six AKs are clustered at the top of the alignment, followed by the sequence of PKA, which is highlighted. The alignment is annotated for key structural features using the JOY software [78]. Secondary structure is represented using the following conventions: light-gray box, β-strand; medium-gray box, 3–10 helix; dark-gray box, α-helix. Residue characteristics are represented using the following conventions: uppercase, solvent inaccessible; lowercase, solvent accessible; bold, hydrogen bond to main chain amide; underline, hydrogen bond to main chain carbonyl; tilde, hydrogen bond to other side-chain; italic, positive Φ; breve, cis-peptide. Residues that are highly conserved within the TPK family and some AKs are highlighted in boxes for the sequences where the conservation applies. The residue(s) seen at these positions are shown in uppercase above the boxes. The letter O stands for general hydrophobicity, but not a specific residue type. Residues that are more weakly conserved in the TPKs but are also conserved in many other AK families are noted with a lowercase letter above the appropriate alignment columns. Selected residues of interest that are conserved only within the TPKs are depicted using the same conventions above, but with gray lettering. Depiction of residues conserved only in the TPKs is not exhaustive: only residues discussed in the text are highlighted above the alignment, generally in structural regions unique to the TPKs. Secondary structures are labeled with the nomenclature used for PKA [12].
Sequence representing unresolved portions of the structure is not shown by JOY. In key portions of the alignment, this sequence is added back in and shown in light gray.
Figure 4. Proposed Phylogeny for the Kinase-Like Superfamily, Based on a Unified Bayesian Analysis of Both the Sequence Alignment in Figure 3 and the Structural Character Matrix in Table 2
Structures are labeled by their PDB IDs, followed by the abbreviated name of the structure. TPKs are to the left of the figure, and are labeled with their group membership. TPKs labeled with a black asterisk are classified differently in our tree compared with the classification produced by Manning et al. [7]. The AKs are highlighted with an orange oval. Major branches are labeled with their posterior probabilities. Gray ovals represent areas of doubt in the tree, based on the tree itself and other aspects of our analysis (see text). The left-hand oval represents uncertainty as to the closest TPK relative to the AKs; it is unclear where precisely the AKs should link to the TPKs (note that this uncertainty does not include the branching of most of the TPK groups in this region, as these are generally well supported). The right-hand oval represents uncertainty as to the proper placement of ChaK and PIPKIIβ. These kinases are difficult to place with high confidence because of their extreme divergence. They are labeled with red asterisks to denote the speculative nature of the current placement (see text).
Figure 5. Shared Hydrogen-Bonding Networks between Distantly Related Structures in the Kinase-Like Superfamily
Colors and nomenclature for secondary structural elements are identical to those provided in Figure 2. Structures shown are the C-terminal subdomains of four structures: (A) PKA [70]; (B) CKA-2 [23]; (C) PI3K [21]; and (D) AFK [22]. For clarity, some portions of structures are omitted. Residues involved in the shared hydrogen-bond networks are shown in a ball-and-stick rendering. For clarity, side-chains are omitted for residues that only participate in the network via backbone interactions. Residues involved directly in catalysis or metal binding are shown with light-green stick regions in the ball-and-stick rendering. Metal atoms, when present, are shown as gray spheres. ATP (or ATP analog), when present, is shown in a line rendering. Hydrogen bonds are shown in cyan. The orientation of the structures is similar but not identical (structures were rotated somewhat to make H-bond contacts more visible). Molecular renderings in this figure were created with MOLSCRIPT [90].
Figure 6. Conventional Distance-Based Phylogenetic Tree of the Kinase-Like Superfamily, Based Only on the Sequence Alignment from Figure 3
This tree did not explicitly incorporate structural information, and is provided for purposes of comparison with the Bayesian tree presented in Figure 4. Structures are labeled by their PDB IDs, followed by the abbreviated name of the structure. The AKs are highlighted by orange ovals. Bootstrap values are provided for major branches. Some branches are too short for values to fit; these are marked with red letters that correspond to the following values: a, 199; b, 170; c, 101; d, 141. Branches highlighted in gray were not supported by bootstrap values above 500, and should be considered speculative (if based only on this tree data) [57,58]. Many of the core relationships within the superfamily cannot be resolved with confidence using the conventional sequence-based approach.
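The support criterion in this legend can be expressed as a small check. The sketch below uses only the four lettered values and the 500 cutoff given in the legend; the names `THRESHOLD`, `short_branch_support`, and `is_speculative` are hypothetical, and because the legend does not state the number of bootstrap replicates, the values are left as raw counts rather than converted to percentages.

```python
# Cutoff quoted in the legend: branches "not supported by bootstrap
# values above 500" are treated as speculative.
THRESHOLD = 500

# Bootstrap values for the short branches marked with red letters.
short_branch_support = {"a": 199, "b": 170, "c": 101, "d": 141}

def is_speculative(support, threshold=THRESHOLD):
    """Return True when the bootstrap value does not exceed the threshold."""
    return support <= threshold

speculative = sorted(
    label for label, value in short_branch_support.items()
    if is_speculative(value)
)
print(speculative)
```

All four lettered branches fall well below the cutoff, consistent with the legend's note that many core relationships cannot be resolved with confidence from sequence alone.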

References

    1. Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995;247:536–540.
    2. Muller A, MacCallum RM, Sternberg MJ. Benchmarking PSI-BLAST in genome annotation. J Mol Biol. 1999;293:1257–1271.
    3. Chothia C, Lesk AM. The relation between the divergence of sequence and structure in proteins. Embo J. 1986;5:823–826.
    4. Hon WC, McKay GA, Thompson PR, Sweet RM, Yang DS, et al. Structure of an enzyme required for aminoglycoside antibiotic resistance reveals homology to eukaryotic protein kinases. Cell. 1997;89:887–895.
    5. Hanks SK. Genomic analysis of the eukaryotic protein kinase superfamily: A perspective. Genome Biol. 2003;4:111.
