Methods of model accuracy estimation can help selecting the best models from decoy sets: Assessment of model accuracy estimations in CASP11

Proteins. 2016 Sep;84 Suppl 1(Suppl 1):349-69. doi: 10.1002/prot.24919. Epub 2015 Sep 28.

Abstract

The article presents an assessment of the model accuracy estimation methods participating in CASP11. The results of the assessment are expected to be useful both to developers of the methods and to users, who are too often presented with structural models without annotations of accuracy. The main emphasis is placed on the ability of techniques to identify the best models from among several available. Bivariate descriptive statistics and ROC analysis are used to additionally assess the overall correctness of the predicted model accuracy scores, the correlation between the predicted and observed accuracy of models, the effectiveness in distinguishing between good and bad models, the ability to discriminate between reliable and unreliable regions in models, and the accuracy of the coordinate error self-estimates. A rigid-body measure (GDT_TS) and three local-structure-based scores (LDDT, CADaa, and SphereGrinder) are used as reference measures for evaluating the methods' performance. Consensus methods, taking advantage of the availability of several models for the same target protein, perform well on the majority of tasks. Methods that predict accuracy on the basis of a single model perform comparably to consensus methods in picking the best models and in estimating the accuracy of local structure. More groups than in previous experiments submitted reasonable error estimates of their own models, most likely in response to a recommendation from CASP and the increasing demand from users. Proteins 2016; 84(Suppl 1):349-369. © 2015 Wiley Periodicals, Inc.
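The three global evaluation ideas named in the abstract (picking the best model, correlation between predicted and observed accuracy, and ROC-based separation of good from bad models) can be illustrated with a minimal sketch. This is not the assessors' code: the function names, the example scores, and the cutoff of 50 on a GDT_TS-like scale are all assumptions made for illustration only.

```python
from statistics import mean


def pearson(predicted, observed):
    """Pearson correlation between predicted and observed accuracy scores."""
    mp, mo = mean(predicted), mean(observed)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_o = sum((o - mo) ** 2 for o in observed)
    return cov / (var_p * var_o) ** 0.5


def selection_loss(predicted, observed):
    """Observed score of the truly best model minus the observed score of
    the model that the estimator ranked first (0 means a perfect pick)."""
    picked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(observed) - observed[picked]


def roc_auc(predicted, observed, threshold):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen good model gets a higher predicted score than a randomly chosen
    bad one (ties count as 0.5). 'Good' means observed >= threshold."""
    good = [p for p, o in zip(predicted, observed) if o >= threshold]
    bad = [p for p, o in zip(predicted, observed) if o < threshold]
    if not good or not bad:
        raise ValueError("need at least one good and one bad model")
    wins = sum(1.0 if g > b else 0.5 if g == b else 0.0
               for g in good for b in bad)
    return wins / (len(good) * len(bad))


# Made-up scores for five models of one hypothetical target.
example_predicted = [0.62, 0.55, 0.71, 0.30, 0.48]   # estimator output
example_observed = [58.0, 61.0, 55.0, 22.0, 40.0]    # GDT_TS-like, 0-100

example_r = pearson(example_predicted, example_observed)
example_loss = selection_loss(example_predicted, example_observed)
example_auc = roc_auc(example_predicted, example_observed, threshold=50.0)

print(f"correlation={example_r:.3f} loss={example_loss} auc={example_auc}")
```

In this toy data the estimator ranks the third model first although the second is actually best, so the selection loss is nonzero even though the good and bad models are separated perfectly. The example shows why the ability to pick the best model is assessed separately from correlation and ROC analysis.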

Keywords: CASP; EMA; QA; estimation of model accuracy; model quality assessment; protein structure modeling; protein structure prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Benchmarking*
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data*
  • Humans
  • Internet
  • Models, Molecular*
  • Models, Statistical*
  • Protein Folding
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • ROC Curve
  • Software*
  • Thermodynamics

Substances

  • Proteins