Acta Crystallogr D Struct Biol. 2020 Mar 1;76(Pt 3):248-260.
doi: 10.1107/S2059798320000455. Epub 2020 Feb 28.

The use of local structural similarity of distant homologues for crystallographic model building from a molecular-replacement solution

Free PMC article

Grzegorz Chojnowski et al. Acta Crystallogr D Struct Biol.

Abstract

The performance of automated protein model building usually decreases with resolution, mainly owing to the lower information content of the experimental data. This calls for a more elaborate use of the available structural information about macromolecules. Here, a new method is presented that uses structural homologues to improve the quality of protein models automatically constructed using ARP/wARP. The method uses local structural similarity between deposited models and the model being built, and results in longer main-chain fragments that in turn can be more reliably docked to the protein sequence. The application of the homology-based model extension method to the example of a CFA synthase at 2.7 Å resolution resulted in a more complete model with almost all of the residues correctly built and docked to the sequence. The method was also evaluated on 1493 molecular-replacement solutions at a resolution of 4.0 Å and better that were submitted to the ARP/wARP web service for model building. A significant improvement in the completeness and sequence coverage of the built models has been observed.

Keywords: ARP/wARP; loop building; macromolecular crystallography; model building; sequence similarity.

Figures

Figure 1
The r.m.s.d. thresholds for selecting matched fragments. (a) Distribution of the r.m.s.d. for fragments of 20 residues in length. Alignments with local structural similarity (‘positives’) and those without (‘negatives’) are indicated. (b) The F1 score as a function of the r.m.s.d. threshold for fragments of different lengths.
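As an illustration of how an r.m.s.d. threshold could be chosen from such distributions, the following is a minimal sketch, not the authors' implementation. It assumes that a fragment alignment is accepted as a match when its r.m.s.d. is at or below the threshold, and it scans the observed r.m.s.d. values for the threshold that maximises the F1 score. The function names and the toy values are hypothetical.

```python
def f1_score(positives, negatives, threshold):
    """F1 score when alignments with r.m.s.d. <= threshold are called matches.

    positives: r.m.s.d. values of alignments with local structural similarity
    negatives: r.m.s.d. values of alignments without it
    """
    tp = sum(r <= threshold for r in positives)   # true matches accepted
    fp = sum(r <= threshold for r in negatives)   # false matches accepted
    fn = len(positives) - tp                      # true matches rejected
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(positives, negatives):
    """Return the observed r.m.s.d. value that maximises the F1 score."""
    candidates = sorted(set(positives) | set(negatives))
    return max(candidates, key=lambda t: f1_score(positives, negatives, t))


# Toy r.m.s.d. values in Angstrom, invented for illustration only.
toy_positives = [0.5, 0.8, 1.0, 1.6]
toy_negatives = [1.4, 2.0, 2.5, 3.0]
print(best_threshold(toy_positives, toy_negatives))
```

Repeating the scan separately for each fragment length would give one threshold per length, which is the kind of dependence that panel (b) describes.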
Figure 2
Schematic representation of the fragment-assembly algorithm. The graphs represent query fragments and aligned matching fragments (black and red, respectively; edge weights and directions are not shown for clarity). In the first step, graph nodes corresponding to the query fragments are merged with all remaining nodes within a distance of 1.0 Å (a). Next, the remaining nodes are merged with their neighbours within a distance of 1.0 Å in an arbitrary order (b). Finally, branching edges are removed (dashed line) (c).
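The three steps in this caption can be sketched in code. This is a simplified illustration under stated assumptions, not the ARP/wARP implementation: nodes are plain 3D points, edges are directed and weighted, the weights of parallel edges are summed on merging, and a branch is resolved by keeping the heaviest edge. The caption does not specify those last two rules, so both are assumptions. All names are hypothetical.

```python
from math import dist


def merge_nodes(coords, query_ids, edges, cutoff=1.0):
    """Steps (a) and (b): merge nodes lying within `cutoff` Angstrom.

    coords: {node: (x, y, z)}, query_ids: set of query-fragment nodes,
    edges: {(u, v): weight} directed edges. Returns the merged edge dict.
    """
    rep = {n: n for n in coords}  # node -> representative after merging
    # (a) absorb matching-fragment nodes into the nearest query node in range
    for n in coords:
        if n in query_ids:
            continue
        q = min(query_ids, key=lambda k: dist(coords[n], coords[k]), default=None)
        if q is not None and dist(coords[n], coords[q]) <= cutoff:
            rep[n] = q
    # (b) merge the remaining nodes with their neighbours, in arbitrary order
    remaining = [n for n in coords if n not in query_ids and rep[n] == n]
    for i, n in enumerate(remaining):
        if rep[n] != n:
            continue
        for m in remaining[i + 1:]:
            if rep[m] == m and dist(coords[n], coords[m]) <= cutoff:
                rep[m] = n
    # rewire edges onto representatives (assumption: sum parallel weights)
    merged = {}
    for (u, v), w in edges.items():
        a, b = rep[u], rep[v]
        if a != b:
            merged[(a, b)] = merged.get((a, b), 0.0) + w
    return merged


def prune_branches(edges):
    """Step (c): keep at most one outgoing and one incoming edge per node.

    Assumption: edges are considered from heaviest to lightest.
    """
    has_out, has_in, kept = set(), set(), {}
    for (u, v), w in sorted(edges.items(), key=lambda kv: -kv[1]):
        if u in has_out or v in has_in:
            continue  # this edge would create a branch
        has_out.add(u)
        has_in.add(v)
        kept[(u, v)] = w
    return kept
```

A possible usage, with invented coordinates: given two query nodes and three aligned homologue fragments, `prune_branches(merge_nodes(coords, query_ids, edges))` returns a single unbranched path in which the edges supported by more fragments carry larger weights.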
Figure 3
(a) Distribution of the highest sequence-identity match of each unique sequence of the ARP/wARP web service model-building tasks (June 2017 to February 2019) to the protein structures already available in the PDB; (b) the corresponding cumulative distribution.
Figure 4
Improvement in model building for test set II at resolutions between 2.0 and 3.0 Å. (a) The fraction of residues built; (b) the sequence coverage. Box-plot whiskers correspond to the 5th and 95th percentiles.
Figure 5
Improvement in model building for test set II at resolutions better than 2.0 Å. (a) The fraction of residues built; (b) the sequence coverage. Box-plot whiskers correspond to the 5th and 95th percentiles.
Figure 6
The improvement of model-building results in the complete test set II as a function of the sequence identity to the closest available homologue. (a) Relative change in model completeness, (b) relative change in sequence coverage. Box-plot whiskers correspond to the 5th and 95th percentiles. The significance level of a one-sided Student’s t-test for the average improvement is marked above the boxes (ns, nonsignificant; p-values below 0.05, 0.01, 0.001 and 0.0001 are denoted with one to four stars, respectively).
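The star notation in this caption can be written as a small lookup. The sketch below is illustrative only: it computes the one-sample t statistic for a set of per-structure improvements (testing whether the mean exceeds zero) and maps a p-value to the labels described above. Converting the t statistic to a one-sided p-value requires the t-distribution, which is not in the Python standard library, so the p-value is assumed to come from elsewhere. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(improvements):
    """One-sample t statistic for H0: mean improvement == 0."""
    n = len(improvements)
    return mean(improvements) / (stdev(improvements) / sqrt(n))


def significance_label(p_value):
    """Map a p-value to 'ns' or one to four stars, as in the caption."""
    for cutoff, label in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < cutoff:
            return label
    return "ns"


# Example: a p-value of 0.005 falls below 0.01 but not below 0.001.
print(significance_label(0.005))
```

The cutoffs are checked from the strictest to the loosest so that each p-value receives the largest number of stars it qualifies for.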
Figure 7
ARP/wARP models of CFA synthase built at 2.7 Å resolution using default parameters. Parts of the models that were not assigned to the sequence are presented in black, while other chains are shown in red and green. The models were built (a) without homology-based extension and (b) with homology-based extension. (c) The closest homologue and the MR search model (PDB entry 3hem), shown in black, superposed onto the ARP/wARP model from (b), shown in grey.
Figure 8
Close-up views of the ARP/wARP model of CFA synthase built at 2.7 Å resolution using default parameters (red) with the superposed closest homologue (PDB entry 3hem, black): (a) core region of the protein with low sequence variability and well conserved structure, (b) solvent-exposed part where sequence and structure diverge (side chains are not shown for clarity). The final 2Fo-Fc maps are contoured at the 1.5σ density level above the mean.
Figure 9
The comparison of the final Rfree for the models built by ARP/wARP with homology-based extension (without free atoms and following ARP/wARP solvent building) as a function of (a) the Rfree value for the initial MR solution, (b) the completeness of the MR model and (c) the fraction of the residues built.
Figure 10
The influence of the homology-based extension on the Rfree value for models built with ARP/wARP version 8.0 at resolutions (a) better than 2.0 Å, (b) between 2.0 and 3.0 Å and (c) worse than 3.0 Å. Box-plot whiskers correspond to the 5th and 95th percentiles.
