Revisiting the "satisfaction of spatial restraints" approach of MODELLER for protein homology modeling

PLoS Comput Biol. 2019 Dec 17;15(12):e1007219. doi: 10.1371/journal.pcbi.1007219. eCollection 2019 Dec.

Abstract

Homology modeling is currently the most frequently used approach for protein structure prediction. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool for this task is MODELLER. This program implements the "modeling by satisfaction of spatial restraints" strategy, and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light, strategies to improve its 3D modeling performance. Firstly, we have investigated how the accuracy with which structural variability between a target protein and its templates is estimated, expressed as σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated with the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program's predictions, especially in multiple-template modeling. Secondly, we have examined how incorporating statistical potential terms (such as the DOPE potential) into MODELLER's objective function improves 3D modeling quality, providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is considerable room for improving MODELLER in terms of 3D modeling quality, and we propose strategies that could be pursued to further increase its performance.
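The role of σ and of an added statistical potential term can be illustrated with a minimal Python sketch. It is not MODELLER's implementation or the altmod API. It assumes that a homology-derived distance restraint is modeled as a Gaussian probability density centred on the template distance, with standard deviation σ, and that the objective is the sum of the negative logarithms of these densities. The function names, the `potential_energy` placeholder (standing in for a DOPE-like score) and the `potential_weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def gaussian_restraint_energy(d_model, d_template, sigma):
    """Negative log of a Gaussian pdf centred on the template distance.

    Assumption: the restraint is a single Gaussian with standard
    deviation `sigma` (in the same units as the distances).
    """
    z = (d_model - d_template) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


def objective(model_distances, template_distances, sigmas,
              potential_energy=0.0, potential_weight=0.0):
    """Sum of restraint terms plus an optional weighted potential term.

    `potential_energy` is a hypothetical placeholder for the value of a
    statistical potential (for example a DOPE-like score) computed
    elsewhere; it is not evaluated here.
    """
    restraint_term = sum(
        gaussian_restraint_energy(d, d0, s)
        for d, d0, s in zip(model_distances, template_distances, sigmas)
    )
    return restraint_term + potential_weight * potential_energy


# The same 1.0 deviation from the template distance, scored under a
# tight (small sigma) and a loose (large sigma) restraint.
tight = gaussian_restraint_energy(6.0, 5.0, sigma=0.5)
loose = gaussian_restraint_energy(6.0, 5.0, sigma=2.0)
print(f"tight restraint energy: {tight:.3f}")
print(f"loose restraint energy: {loose:.3f}")
```

Under these assumptions, a small σ penalizes a given deviation from the template distance more strongly than a large σ. An underestimated σ therefore forces the model towards the template even where the target truly diverges, while an overestimated σ discards useful template information.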

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Models, Molecular*
  • Molecular Dynamics Simulation / statistics & numerical data
  • Proteins / chemistry
  • Sequence Alignment / statistics & numerical data
  • Software*
  • Structural Homology, Protein*

Substances

  • Proteins

Grants and funding

GJ and AP received support from Associazione Italiana Ricerca sul Cancro (AIRC, https://www.airc.it/) MFAG 20447 and Progetti Ateneo Sapienza University of Rome (https://www.uniroma1.it). GG received support from Associazione Italiana Ricerca sul Cancro (AIRC, https://www.airc.it/) IG Grant 17390. AP, AG and GJ acknowledge the CINECA award under the ISCRA initiative for the availability of high-performance computing resources and support (IsC68_altmod). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.