MolProbity's ultimate rotamer-library distributions for model validation

Proteins. 2016 Sep;84(9):1177-89. doi: 10.1002/prot.25039. Epub 2016 Jun 23.


Here we describe the updated MolProbity rotamer-library distributions derived from an order-of-magnitude larger and more stringently quality-filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side-chains with satisfactory justification for their given conformation, we added residue-specific filters for electron-density value and model-to-density fit. The combined new protocol retains a million residues of data, while cleaning up false-positive noise in the multi-χ datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000-fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond-angle deviations in enabling presumably strained side-chain conformations. Further, along with favored and outlier, an allowed category (0.3-2.0% occurrence in reference data) has been added, analogous to Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model-building and refinement. The multi-dimensional χ distributions and Top8000 reference dataset are freely available on GitHub. These rotamers are termed "ultimate" because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side-chain with backbone criteria. Proteins 2016; 84:1177-1189. © 2016 Wiley Periodicals, Inc.
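
The three validation categories named in the abstract can be illustrated with a minimal sketch. It assumes each side-chain already has a score expressed as a percentage on the same scale as the 0.3-2.0% band quoted above. The function name `classify_rotamer` and the treatment of values falling exactly on 0.3 or 2.0 are assumptions made for illustration, not the MolProbity or PHENIX implementation.

```python
def classify_rotamer(score_percent: float) -> str:
    """Map a rotamer score (percent occurrence in reference data) to a category.

    Thresholds follow the 0.3-2.0% "allowed" band stated in the abstract.
    Assumption: a score exactly at 0.3 counts as allowed, and a score
    exactly at 2.0 counts as favored.
    """
    if not 0.0 <= score_percent <= 100.0:
        raise ValueError("score_percent must lie between 0 and 100")
    if score_percent < 0.3:
        return "outlier"
    if score_percent < 2.0:
        return "allowed"
    return "favored"


if __name__ == "__main__":
    # Hypothetical scores, chosen only to exercise each branch.
    for score in (0.1, 1.0, 35.0):
        print(f"{score:5.1f}% -> {classify_rotamer(score)}")
```

The cutoffs are applied to a single precomputed score per residue. Obtaining that score from the multi-dimensional χ distributions is outside the scope of this sketch.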

Keywords: Phenix; high-quality dataset; protein conformation; rare side-chain conformations; side-chain rotamer library; structural bioinformatics; structure validation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acids / chemistry
  • Databases, Protein
  • Datasets as Topic
  • Electrons*
  • Peptide Library*
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / classification
  • Statistical Distributions
  • Thermodynamics

Substances

  • Amino Acids
  • Peptide Library
  • Proteins