Review
Front Biosci (Landmark Ed). 2009 Jan 1;14:900-17.
doi: 10.2741/3285.

Evolutionary and Biophysical Relationships Among the Papillomavirus E2 Proteins

Free PMC article
Dukagjin M Blakaj et al. Front Biosci (Landmark Ed).

Abstract

Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sites composed of highly conserved four-base-pair sequences flanking a variable 'spacer' of identical length. E2 proteins directly contact the conserved DNA but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that depends on the proteins' sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications for cancer biology are discussed.

Figures

Figure 1
A) Schematic of the HPV16 E2 gene and known interacting proteins. B) Ribbon diagram of the structure of the HPV16 E2 transactivation domain (31). C) A ribbon representation of the structure of the HPV18 E2 DNA binding domain showing the dimeric protein bound to the DNA sequence ACCGAATTCGGT (PDB ID: 1JJ4). The recognition helix alpha1 makes direct contact with the major groove of the DNA. The bases of the spacer region AATT are depicted.
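The palindromic character of the site quoted in panel C can be checked directly, since a palindromic DNA site is identical to its own reverse complement. The following is a minimal Python sketch; the helper names `reverse_complement` and `is_palindromic` are illustrative and are not taken from the article.

```python
# Illustrative helpers (not from the article).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A site is palindromic when it reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


site = "ACCGAATTCGGT"  # sequence quoted in the Figure 1C caption
print(is_palindromic(site))  # whether the site equals its reverse complement
print(site[4:8])             # the central four bases, i.e. the spacer
```

Slicing positions 4 to 8 assumes the 4 + 4 + 4 layout described in the abstract (four conserved base pairs on each side of a four-base-pair spacer).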
Figure 2
A) Schematic of the papillomavirus Long Control Region (LCR) or Upstream Regulatory Region (URR) and E2 binding sites I-IV. B) The consensus E2 binding site derived from the 122 papillomavirus types that were analyzed in this study. The frequency of the preferred base pairs at positions -3 and +3 (flanking percentages) and the occurrence of each pair at positions -3 and +3 are shown in the center (central percentages). DNA sequences within the URR of the papillomaviruses included in this study (Appendix 2) were analyzed for binding motifs containing the ACCN6GGT sequence. This template excludes binding sites in which there is a substitution in the highly conserved bases that participate in the direct interactions between the E2 protein and the DNA. The European Molecular Biology Open Software Suite (EMBOSS) package was used to locate motifs within a given sequence (117).
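The ACCN6GGT template search described above can be illustrated with a regular expression. This is only a sketch in Python, used here as a stand-in for the EMBOSS tool the authors ran; the function name `find_e2_sites` and the toy sequence are assumptions made for illustration and do not come from the article.

```python
import re

# ACCN6GGT: conserved ACC and GGT separated by any six bases.
# Wrapping the pattern in a lookahead lets overlapping sites be reported too.
E2_SITE = re.compile(r"(?=(ACC[ACGT]{6}GGT))")


def find_e2_sites(urr: str):
    """Return (start, site) pairs (0-based start) for each ACCN6GGT match."""
    return [(m.start(), m.group(1)) for m in E2_SITE.finditer(urr.upper())]


# Made-up sequence for illustration only; it is not a real URR.
toy = "TTACCGAATTCGGTAAACCGTTTTCGGTCC"
for start, site in find_e2_sites(toy):
    print(start, site)
```

Because the template ACCN6GGT is its own reverse complement, scanning one strand is enough to find every site that fits it.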
Figure 3
Structural surface representation of the amino acid sequence conservation using the HPV16 E2 DNA binding domain as the template. Red denotes > 90% conservation, orange denotes a conservative substitution and blue denotes no conservation. A) The DNA binding surface of the DNA binding domain in surface fill view; B) 180 degree rotation showing the surface that does not contact DNA. The asterisk indicates a second conserved surface that is separate from the documented E2-E1 interface; and C) the dimer interface. The mucosal papillomaviruses are the most highly conserved in the DNA binding and dimerization surfaces. D) Row D shows the conservation of the E2 transactivation domain using the structure of the domain of the HPV11 E2 protein in complex with a molecular inhibitor ((59); PDB ID: 1DTO). The inhibitor, shown in green, binds to a pocket conserved throughout papillomavirus evolution. Note: The sequences used in this study were obtained from the sources summarized in Appendix 2. Grouping these sequences into (1) Other (animal E2 sequences), (2) beta genus and (3) alpha genus was done as described in the text (1). Multiple sequence alignments were performed using the ClustalW program (118). The secondary structure assignments shown in Figure 5 were taken from the structure of HPV16 E2/D (PDB ID: 1by9). Residue conservation calculations were performed using the AMAS program (119). Briefly, AMAS is a program that performs a systematic characterization of the physical-chemical properties seen at each position in a multiple protein sequence alignment. A flexible set-based description of amino acid properties is used to define the conservation between any groups of amino acids.
Figure 4
A) Close-up of the structure of the HPV16 E2 DNA binding domain (PDB ID: 1BY9) highlighting the Gly residue in a loop preceding the DNA recognition helix that is absolutely conserved in all 122 papillomavirus types and 24 papillomavirus HPV16 variants that were analyzed. The picture shows the loop that contains Gly293 (red), the residues that are within 5 angstroms (Lys290, Asp292, Ala 293, Leu 296, Ala 329 and Ile 330) in magenta; B) Homology models that were generated after in silico mutagenesis of HPV 16 E2 Gly293 to Ala and Val, indicating the shift in the spatial position of the recognition helix. The wild type protein is colored green, the G293A mutant protein is colored red and the G293V mutant protein is colored blue. This view looks up at the recognition helices of the E2 protein dimer.
Figure 5
Summary of the generation of homology models (54) by group from the indicated templates for 122 papillomavirus types and the RMSD values obtained from comparison of the models. Comparative protein structure modeling of E2 sequences was performed using MODELLER (120, 121). Briefly, MODELLER performs comparative modeling of sequences by satisfaction of spatial restraints inherited from one or more proteins of known structure, also known as templates. Three templates were used to generate the modeled structures: BPV1 E2/D (PDB ID: 1jjh), HPV16 and HPV18 E2/D (PDB IDs: 1by9 and 1f9f), and HPV6 E2/D (PDB ID: 1r8h). The sequence identities between target sequences and templates were in the range of 40 - 80%, which is considered a ‘safe’ range of sequence identity where accurate models can be obtained. Five models were constructed for each sequence and the one with the best energy (according to MODELLER’s energy function) was kept. In addition, models were inspected using PROSAII (122) and PROCHECK (123) to further analyze their quality. Structural superposition of models was calculated with STAMP using only main chain atom coordinates (124).
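The 40 - 80% identity window quoted above can be made concrete with a small calculation. The Python sketch below computes percent identity over the aligned, non-gap positions of two pre-aligned sequences. The caption does not say which definition of identity (in particular which denominator) the authors used, so this choice, the function names and the toy sequences are all assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').

    Assumes the two strings come from the same alignment, so they must
    have equal length. Other identity definitions use other denominators.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def in_quoted_range(identity: float, low: float = 40.0, high: float = 80.0) -> bool:
    """True when identity falls inside the 40 - 80% window quoted in the caption."""
    return low <= identity <= high


# Toy aligned fragments, not real E2 sequences.
identity = percent_identity("ACDE-", "ACDFG")
print(identity, in_quoted_range(identity))
```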
Figure 6
A) A similarity index (125), with equivalent x and y axes, comparing the electrostatic potential of the DNA binding surface of E2 proteins from low- and high-risk HPVs (5). A value of one represents complete similarity while zero denotes no similarity. Protein interaction properties similarity analysis (PIPSA) (125) was used to compute the difference in electrostatic potential among structure models. PIPSA calculates the Hodgkin similarity index (SI), which measures the similarity of two molecular potentials in sign, magnitude and spatial behavior. The SI value ranges from -1 for anti-correlated potentials (i.e. opposite sign) to +1 for correlated potentials. The SI was traditionally applied to measure electrostatic potential similarities between small molecules, but it can also be used for proteins using a grid approach. The electrostatic potentials were calculated using the APBS program (126) and compared with PIPSA using the following parameters: a regular cubic lattice with 65 angstrom dimensions and 1.5 angstrom spacing, two monovalent ionic species (charges -1 and +1) at a concentration of 0.050 M, a dielectric constant of 78, and a temperature of 298.15 K. Cubic lattice dimensions were manually inspected to ensure the complete immersion of proteins in the lattice. The cubic lattice was centered at the center of the models and only the DNA binding surface was used to compute the SI values. B) The surface potentials calculated for the HPV6 and HPV16 E2 proteins illustrating the high degree of similarity. These two surfaces differ only in a diffuse increase in the electropositive potential of the surface for the HPV16 E2 protein.
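The Hodgkin index mentioned above has a compact closed form: for two potentials sampled at the same grid points, SI = 2 Σ p1·p2 / (Σ p1² + Σ p2²). The Python sketch below evaluates that formula on plain lists of values. It is not PIPSA itself, which also selects the grid points to compare, and the function name `hodgkin_si` and the toy numbers are assumptions for illustration.

```python
def hodgkin_si(p1, p2):
    """Hodgkin similarity index of two potentials sampled on the same points.

    Returns a value in [-1, 1]: +1 for identical potentials, -1 for
    potentials of equal magnitude and opposite sign.
    """
    if len(p1) != len(p2):
        raise ValueError("potentials must be sampled on the same grid points")
    numerator = 2.0 * sum(a * b for a, b in zip(p1, p2))
    denominator = sum(a * a for a in p1) + sum(b * b for b in p2)
    if denominator == 0.0:
        raise ValueError("both potentials are zero everywhere")
    return numerator / denominator


# Toy values, not computed potentials.
reference = [1.0, 2.0, 3.0]
print(hodgkin_si(reference, reference))                # identical potentials
print(hodgkin_si(reference, [-v for v in reference]))  # sign-inverted copy
print(hodgkin_si([1.0, 2.0], [2.0, 4.0]))              # same shape, doubled magnitude
```

The last call shows why the index is described as sensitive to magnitude as well as sign: a potential with the same shape but twice the magnitude scores below 1.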
Figure 7
A) Annotation of the papillomavirus E2 gene, including the N terminal transactivation domain and C terminal DNA binding domain, that shows only the amino acid differences among the HPV16 variants. The overall identity of the HPV16 variant E2 genes is > 95%. The numbering corresponds to the HPV16 E2 GenBank deposition. The numbering is not continuous and is represented along the bottom of the figure; B) The amino acid alignment of the HPV16 E2 prototype (E2pro) gene and 23 variants that depicts only the variable amino acid positions. The amino acid substitutions that are uniquely conserved within the Non-European group are H35Q, T135K, H136Y, A143T, and R165Q in the transactivation domain and T310K, W341C and D344E in the DNA binding domain. Tr denotes a transitional epidemiological classification, ‘European Asian’. Additional information about individual variants is in Appendix 3. The numbers at the bottom of the figure represent the amino acid sequence number within the E2 gene (numbers are read from top to bottom).
Figure 8
A) A 90° rotation of the E2/D protein structure (frame of reference is Figure 2C) showing in red the mutations present in the Non-European variants (Figure 6). These mutations are on the surface of the protein that interacts with the E1 replication protein (57); B) Ribbon diagram of the structure of the HPV18 E2 transactivation domain (yellow) in complex with the E1 helicase domain (blue) (33). The structures of the HPV18 and HPV16 E2 transactivation domains are highly similar with an RMSD of ~ 1 Å (33). The mutations present in the proteins of the Non-European variants are shown in red (Figure 6). The E452D amino acid substitution present in Non-European HPV16 E1 protein variants is highlighted in magenta. No structural information is available for three other conserved amino acid variations within the E1 protein for the Non-European variants, Q78E, C168S and I326M, which are not present within this domain.
Figure 9
Summary of the equilibrium binding constants (Kd, nM) determined for BPV1-E2/D (A & C) and HPV16-E2/D (B & D) binding to DNA containing the AATT, TTAA and ACGT spacer sequences. Panels A and B summarize binding in buffer containing 150 mM KCl (black bars) and 250 mM KCl (grey bars). The ionic strength of these solutions is 0.15 M and 0.25 M, respectively. Panels C and D summarize binding in the presence of Mg2+: buffer containing 150 mM KCl (black bars), buffer containing 150 mM KCl and 10 mM MgCl2 (black crosshatched bars; ionic strength = 0.19 M), and buffer containing 110 mM KCl and 10 mM MgCl2 (grey crosshatched bars; ionic strength = 0.15 M) (27). The data indicate increased affinity for AT-rich sequences as the KCl concentration increases and in the presence of MgCl2. Figure reprinted from reference (27) with permission from Elsevier.
