Forcefields used in biomolecular simulations are composed of energetic terms that are either physical in nature, with parameters fit to quantum mechanical calculations or experimental data, or statistical, drawing on high-resolution structural data to describe distributions of molecular features. Combining the two in a single forcefield is challenging: the physical terms account for some, but not all, of the observed statistics, so summing both kinds of term double counts those interactions. In this manuscript, we develop a general scheme for correcting statistical potentials used in combination with physical terms. We apply these corrections to the sidechain torsional potential used in the Rosetta all-atom forcefield. We show that the approach identifies instances of double-counted interactions, including electrostatic interactions between sidechains and the nearby backbone, and steric interactions between neighboring Cβ atoms within secondary structural elements. Moreover, this scheme allows for the inclusion of intraresidue physical terms, which were previously turned off to avoid overlap with the statistical potential. Combined, these corrections lead to a forcefield with improved performance on several structure prediction tasks, including rotamer prediction and native structure discrimination.
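The double-counting problem can be illustrated with a toy calculation. The sketch below is not the scheme developed in this manuscript and does not use any Rosetta code. It assumes a single discretized torsion with three bins, a statistical term obtained by plain Boltzmann inversion of observed frequencies, and made-up physical energies; the function names, the value of kT, and all numbers are hypothetical. Under those assumptions, subtracting the physical energy from the inverted statistics gives a corrected term that, once added back to the physical term, reproduces the observed distribution.

```python
import math

KT = 0.593  # kcal/mol near 298 K; an assumed temperature for this toy example


def boltzmann_inversion(probs, kt=KT):
    """Turn a discrete probability distribution into energies, E = -kT ln P."""
    return [-kt * math.log(p) for p in probs]


def boltzmann_distribution(energies, kt=KT):
    """Turn energies back into a normalized probability distribution."""
    weights = [math.exp(-e / kt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def corrected_statistical_term(observed_probs, physical_energies, kt=KT):
    """Statistical term with the physical contribution removed.

    Assumption: the physical energy is known per bin, so the correction is a
    bin-wise subtraction. The result is shifted so its minimum is zero.
    """
    raw = boltzmann_inversion(observed_probs, kt)
    corrected = [s - e for s, e in zip(raw, physical_energies)]
    shift = min(corrected)
    return [c - shift for c in corrected]


# Hypothetical observed rotamer-bin frequencies and physical energies (kcal/mol).
observed = [0.6, 0.3, 0.1]
physical = [0.0, 0.5, 1.2]

# Naive combination: raw statistics plus physical terms penalizes bins twice.
raw = boltzmann_inversion(observed)
naive = boltzmann_distribution([s + e for s, e in zip(raw, physical)])

# Corrected combination: recovers the observed frequencies.
corrected = corrected_statistical_term(observed, physical)
recovered = boltzmann_distribution([c + e for c, e in zip(corrected, physical)])
```

In this toy case the naive sum over-populates the lowest-energy bin relative to the observed frequencies, while the corrected sum matches them; real forcefields involve many coupled degrees of freedom, where the physical contribution is not a simple per-bin number.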
Keywords: Rosetta; protein structure prediction; rotamer library; sidechain prediction; torsional potential.
© 2016 The Protein Society.