Nat Protoc. 10 (6), 845-58

The Phyre2 Web Portal for Protein Modeling, Prediction and Analysis

Lawrence A Kelley et al. Nat Protoc.

Abstract

Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional tools is also described: to find a protein structure in a genome, to submit large numbers of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.

Conflict of interest statement

MJES is a Director and shareholder in Equinox Pharma Ltd which uses bioinformatics and chemoinformatics in drug discovery research and services.

Figures

Figure 1
Normal mode Phyre2 pipeline showing algorithmic stages. Stage numbers are shown in circles and elements within a stage are surrounded by a dashed box. Stage 1 (gathering homologous sequences): A query sequence is scanned with HHblits against the specially curated nr20 protein sequence database (no sequences with >20% mutual sequence identity). The resulting multiple sequence alignment is used to predict secondary structure with PSI-pred, and the alignment and secondary structure prediction are combined into a query hidden Markov model (HMM). Stage 2 (fold library scanning): The query HMM is scanned against a database of HMMs of proteins of known structure. The top scoring alignments from this search are used to construct crude backbone-only models. Stage 3 (loop modelling): Insertions and deletions in these models are corrected by loop modelling. Stage 4 (side chain placement): Finally, amino acid side chains are added to generate the final Phyre2 model.
Figure 2
Intensive mode Phyre2 pipeline. Once a set of models has been generated as shown in stages 1-3 of Figure 1, models are chosen by heuristics to maximise both confidence and coverage of the query sequence. Pairwise Cα-Cα distances are extracted from these models and treated as linear inelastic springs in Poing. Regions not covered by templates are handled by the ab initio components of the Poing algorithm: preferential bombardment of hydrophobic residues by notional solvent molecules to encourage burial, predicted secondary structure springs to maintain alpha helix or beta strand conformations, and prevention of steric clash. The new protein is ‘synthesised’ from a virtual ribosome in the context of these forces, and the final Cα structure is used to construct a full backbone using Pulchra, followed by sidechain addition using R3.
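
The idea of turning template Cα-Cα distances into restraints can be illustrated with a minimal Python sketch. It is not the Poing implementation: the function names, the toy coordinates and the simple harmonic penalty 0.5 * k * (d - d0)^2 are assumptions chosen for illustration, and the caption's "linear inelastic springs" may behave differently.

```python
import math
from itertools import combinations


def ca_distances(coords):
    """Return {(i, j): distance} for every pair of C-alpha positions with i < j."""
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


def spring_energy(coords, restraints, k=1.0):
    """Sum harmonic penalties 0.5 * k * (d - d0)^2 over all restrained pairs.

    The harmonic form and the constant k are illustrative assumptions.
    """
    total = 0.0
    for (i, j), d0 in restraints.items():
        d = math.dist(coords[i], coords[j])
        total += 0.5 * k * (d - d0) ** 2
    return total


# Hypothetical three-residue template (coordinates in angstroms).
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
restraints = ca_distances(template)

# A candidate structure in which the second residue has drifted by 1 angstrom.
stretched = [(0.0, 0.0, 0.0), (4.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

print(spring_energy(template, restraints))   # no penalty for the template itself
print(spring_energy(stretched, restraints))  # positive penalty for the distorted copy
```

A structure that reproduces every template distance incurs no penalty, so minimising this kind of energy pulls the modelled regions toward the template geometry while leaving unrestrained regions free.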
Figure 3
Phyre Investigator user interface. a. information box, b. structure and analyses view, c. sequence view. The structure and analyses view shows an interactive 3D JSmol viewer, buttons to toggle different analyses and two bar graphs, in this case for residue A34, showing the sequence profile preferences and predicted likelihood of a phenotypic effect from each of the 20 possible mutations at this position.
Figure 4
Example Phyre2 summary results page. On the left is an image of a large all-beta structure. Clicking on the image will download a PDB formatted file containing this structure. On the right are various data regarding the model including: PDB code of the template used, information about the protein template extracted from the PDB file, confidence in the model and coverage of the query sequence (100% and 28% respectively). In this case there is additional text informing the user that although only 28% of the query could be modelled by a single template, other high confidence templates were also detected that could increase this coverage to 55% by using Phyre’s intensive mode. Finally there is a link to launch the JSmol 3D viewer in the browser and a link to a FAQ describing popular external molecular viewing software.
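
The coverage figures quoted in this caption can be reproduced with a short Python sketch. It assumes coverage means the percentage of query residues that fall inside at least one aligned template segment; the function name, the query length of 200 and the segment boundaries are hypothetical values picked so that the numbers match the 28% and 55% in the example.

```python
def coverage(query_length, segments):
    """Percentage of query residues covered by at least one aligned segment.

    segments holds 1-based inclusive (start, end) residue ranges on the query.
    """
    covered = set()
    for start, end in segments:
        # Clamp each segment to the query so stray indices are ignored.
        covered.update(range(max(1, start), min(query_length, end) + 1))
    return 100.0 * len(covered) / query_length


# Hypothetical 200-residue query.
single_template = [(1, 56)]
multiple_templates = [(1, 56), (100, 153)]

print(coverage(200, single_template))     # 28.0
print(coverage(200, multiple_templates))  # 55.0
```

Using a set of residue indices means overlapping templates are not counted twice, which is why combining templates raises coverage only where they model different parts of the query.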
Figure 5
Samples of the three main sections of a typical Phyre2 results page. The sections are labelled a-c and discussed below. a. Example secondary structure and disorder prediction. The query sequence is coloured as described in Step 17. Question marks indicate predicted disordered regions. Each type of prediction is associated with a rainbow colour-coded confidence (red: highest confidence, blue: lowest confidence). b. Example of the domain analysis results section described in Steps 20-22. The width of the box indicates the length of the query sequence. In this example, confident (red) matches have been found at the N-terminus (rank 6) and the C-terminus (ranks 1-5), but no confident matches have been found for the intervening segment. c. Example of the detailed table of results described in Steps 23-24 and 29-32. In this example, the rank 1 and 2 matches have confidence of 100% and sequence identities of 23% and 24% respectively.
Figure 6
Example alignment between user query sequence and known structure, as described in Steps 25-28. Sequence colouring is as described in Step 17. Identical residues between query and template have a grey background. Secondary structures (predicted and known) are displayed; in this case alpha helices. Colour-coded per-residue confidence in both the alignment (from HHsearch) and in secondary structure prediction is displayed. The level of residue conservation for both the query and template sequences is also shown where thicker horizontal bars indicate greater degrees of conservation.
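
The identical residues highlighted in such an alignment are what a sequence identity percentage summarises. The Python sketch below shows one common way to compute it; it is an assumption for illustration, since definitions differ in their choice of denominator and the exact convention used by Phyre2 is not stated here. The function name and the two aligned strings are hypothetical.

```python
def percent_identity(query_aln, template_aln, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both arguments are rows of the same pairwise alignment, so they must have
    equal length. Using gap-free columns as the denominator is an assumption.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    aligned = [
        (q, t) for q, t in zip(query_aln, template_aln) if q != gap and t != gap
    ]
    if not aligned:
        return 0.0
    matches = sum(q == t for q, t in aligned)
    return 100.0 * matches / len(aligned)


# Hypothetical alignment: 8 gap-free columns, 6 of them identical.
query_row = "MKT-AYIAKQ"
template_row = "MRTLAY-ARQ"

print(percent_identity(query_row, template_row))  # 75.0
```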
