Structure. 2022 Jul 7;30(7):925-933.e2.
doi: 10.1016/j.str.2022.04.005. Epub 2022 May 9.

The accuracy of protein structures in solution determined by AlphaFold and NMR


Nicholas J Fowler et al. Structure.

Abstract

In the recent Critical Assessment of Structure Prediction (CASP) competition, AlphaFold2 performed outstandingly. Its worst predictions were for nuclear magnetic resonance (NMR) structures, which has two alternative explanations: either the NMR structures were poor, implying that AlphaFold may be more accurate than NMR, or there is a genuine difference between crystal and solution structures. Here, we use the program Accuracy of NMR Structures Using RCI and Rigidity (ANSURR), which measures the accuracy of solution structures, and show that one of the NMR structures was indeed poor. We then compare AlphaFold predictions to NMR structures and show that AlphaFold tends to be more accurate than NMR ensembles. There are, however, some cases where the NMR ensembles are more accurate. These tend to be dynamic structures, where AlphaFold had low confidence. We suggest that AlphaFold could be used as the model for NMR-structure refinements and that AlphaFold structures validated by ANSURR may require no further refinement.

Keywords: ANSURR; NMR; alphafold; dynamics; protein structure.


Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
ANSURR scores for the three CASP14 NMR targets. (A and B) Results for (A) all models and (B) ensemble averages are shown. NMR structures are in orange, AF2 models in blue, and all other predictions in gray. The green points shown for T1029 are scores for an NMR ensemble that was re-calculated after the CASP14 results were released and are discussed below. The NMR structure for T1055 (PDB: 6ZYC) has 20 models, and the NMR structure for T1027 (PDB: 7D2O) has 19 models. The original NMR structure for T1029 (PDB: 6UF2) has 10 models, and the recalculated structure (PDB: 7N82) has 20 models. Each group competing in CASP14 could provide up to five predictions. See also Figures S2–S4.
Figure 2
ANSURR analysis of T1027. (A and B) Blue lines show the rigidity as measured by RCI based on backbone chemical shifts (BMRB: 36288); orange lines show the rigidity (A) of the best-scoring NMR structure (model 11 from the ensemble) and (B) of the best-scoring AF2 model (model 3). Red bars at the top of each figure denote α-helical structure as assessed from the structure using DSSP, and blue bars denote β-sheet. Regions characterized as ill-defined by CYRANGE are indicated in gray. See also Figure S1.
Figure 3
Frequency distribution for the difference in ANSURR score between the AF2 prediction and NMR structure. Values are given as [AF2 score] - [NMR score] so that a positive difference indicates a better score for the AF2 prediction. Selection criteria are outlined in STAR Methods. (A) Comparison of AF2 to the averaged ANSURR score for the NMR ensemble. Mean difference is 28. (B) Comparison of AF2 to the single best NMR structure in the ensemble (the NMR structure with the best ANSURR score). Mean difference is 2. (C) Breakdown of the data in (A) by protein secondary structure classification as determined by DSSP using proteins classified as α-helical, β-sheet, or mixed α/β.
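The two differences described in the Figure 3 legend can be illustrated with a minimal Python sketch. The function name and the example scores are hypothetical and are not taken from the paper; the sketch only assumes, as the legend states, that a higher ANSURR score is better and that the difference is computed as [AF2 score] - [NMR score].

```python
from statistics import mean


def score_differences(af2_score, nmr_model_scores):
    """Return (AF2 minus NMR ensemble average, AF2 minus best NMR model).

    A positive value means the AF2 prediction scored better.
    Hypothetical helper for illustration only.
    """
    if not nmr_model_scores:
        raise ValueError("NMR ensemble must contain at least one model score")
    vs_average = af2_score - mean(nmr_model_scores)  # as in panel (A)
    vs_best = af2_score - max(nmr_model_scores)      # as in panel (B)
    return vs_average, vs_best


# Made-up scores for one protein: one AF2 model and a three-model NMR ensemble.
print(score_differences(150.0, [100.0, 120.0, 140.0]))  # (30.0, 10.0)
```

Because the best model in an ensemble scores at least as well as the ensemble average, the difference against the best model can never exceed the difference against the average, which is consistent with the smaller mean difference reported for panel (B).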
Figure 4
Representative ANSURR output for two proteins where the AF2 model is more accurate than the NMR structure. Each panel shows the rigidity from chemical shifts in blue and the structure rigidity in orange. The colored bars at the top of each plot indicate regions of regular secondary structure: α-helix (red) and β-sheet (blue). The structures are shown beside each plot in cartoon representation, with backbone hydrogen bonds depicted as gray lines. (A and B) Twentieth Filamin domain from human Filamin-B. (A) is the NMR structure (PDB: 2DLG, model 19) and (B) is the AF2 model (UniProt: O75369). (C and D) The zinc-finger BED domain of the zinc-finger BED-domain-containing protein 1. (C) is the NMR structure (PDB: 2CT5, model 3) and (D) is the AF2 model (UniProt: O96006).
Figure 5
Representative ANSURR output for two proteins where the NMR structure is better than the AF2 model. Color scheme as for Figure 4. The structures are shown beside each plot in cartoon representation, with backbone hydrogen bonds depicted as gray lines. (A and B) EF-hand domain of human polycystin 2. (A) is the NMR structure (PDB: 2Y4Q, model 3) and (B) is the AF2 structure (UniProt: Q13563). (C and D) Transmembrane and juxtamembrane domains of epidermal growth factor receptor in dodecylphosphocholine (DPC) micelles. (C) is the NMR structure (PDB: 2N5S, model 2), and (D) is the AF2 structure (UniProt: P00533). See also Figures S5 and S6.
Figure 6
A comparison of pLDDT scores with ANSURR scores. (A) The mean pLDDT score averaged over all amino acids for each AF2 model. Statistics are shown for all AF2 models in the test set and separately for the n = 273 structures in which the AF2 structure is significantly better than the NMR structure, and for the n = 22 structures in which the NMR structure is significantly better than the AF2 structure. The mean pLDDT score is shown below each box. (B) Correlation plot for mean pLDDT scores versus ANSURR scores for each AF2 model in the test set. The orange line is the line of best fit. Pearson’s r and the corresponding two-tailed p value are given in the legend. (C) Correlation plot for mean pLDDT scores computed for well-defined regions versus ANSURR scores for each AF2 model in the test set.
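As a rough illustration of the correlation analysis described for Figure 6 (B and C), the sketch below computes Pearson's r and a least-squares line of best fit for paired values such as mean pLDDT and ANSURR score. The function names and the input numbers are hypothetical; this is not the authors' code, and it does not compute the p value.

```python
import math


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for paired observations.

    r is Pearson's correlation coefficient; slope and intercept describe
    the ordinary least-squares line y = slope * x + intercept.
    Hypothetical helper for illustration only.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Made-up values standing in for mean pLDDT (x) and ANSURR score (y).
mean_plddt = [60.0, 70.0, 80.0, 90.0]
ansurr = [50.0, 90.0, 110.0, 150.0]
r, slope, intercept = pearson_and_fit(mean_plddt, ansurr)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

If SciPy is available, `scipy.stats.pearsonr` returns both the coefficient and a two-sided p value, which is closer to what the legend reports.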
