Background: Protein sites evolve at different rates due to functional and biophysical constraints. It is usually assumed that the main structural determinant of a site's rate of evolution is its Relative Solvent Accessibility (RSA). However, a recent comparative study has shown that the main structural determinant is the site's Local Packing Density (LPD). LPD is related to dynamical flexibility, which has also been shown to correlate with sequence variability. Our purpose is to investigate the mechanism that connects a site's LPD with its rate of evolution.
Results: We consider two models: an empirical Flexibility Model and a mechanistic Stress Model. The Flexibility Model postulates a linear increase of site-specific rate of evolution with dynamical flexibility. The Stress Model, introduced here, models mutations as random perturbations of the protein's potential energy landscape, for which we use simple Elastic Network Models (ENMs). To account for natural selection we assume a single active conformation and use basic statistical physics to derive a linear relationship between site-specific evolutionary rates and the local stress of the mutant's active conformation. We compare both models on a large and diverse dataset of enzymes. In a protein-by-protein study we find that the Stress Model outperforms the Flexibility Model for most proteins. Pooling all proteins together, we show that the Stress Model is strongly supported by the total weight of evidence. Moreover, it accounts for the observed nonlinear dependence of sequence variability on flexibility. Finally, when mutational stress is controlled for, there is very little remaining correlation between sequence variability and dynamical flexibility.
Conclusions: We developed a mechanistic Stress Model of evolution according to which the rate of evolution of a site is predicted to depend linearly on the local mutational stress of the active conformation. This local stress is proportional to LPD, so the model explains the relationship between LPD and evolutionary rate. Moreover, the model also accounts for the nonlinear dependence of evolutionary rate on dynamical flexibility.
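
As a rough illustration of the two structural quantities contrasted above, the following sketch computes a packing-density proxy and a flexibility proxy from a simple elastic network. It is not the method of this study: the choice of a Gaussian Network Model as the ENM, the use of contact number as the LPD measure, the 4.0 Å cutoff, and the toy 4x4x4 lattice of sites are all assumptions made here for illustration only.

```python
import numpy as np

def kirchhoff(coords, cutoff):
    """Connectivity (Kirchhoff) matrix of a Gaussian Network Model.

    Assumption: two sites are joined by an identical spring whenever
    they lie closer than `cutoff`; this is an illustrative ENM choice.
    """
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

# Toy "protein": a 4x4x4 cubic lattice of sites spaced 3.8 apart (hypothetical).
axis = np.arange(4) * 3.8
coords = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

gamma = kirchhoff(coords, cutoff=4.0)

# Packing-density proxy: number of contacts per site (diagonal of gamma).
cn = np.diag(gamma).copy()

# Flexibility proxy: relative mean-square fluctuation per site, taken as the
# diagonal of the pseudo-inverse of gamma (up to a constant factor).
flex = np.diag(np.linalg.pinv(gamma)).copy()

corner, buried = 0, 21  # lattice positions (0,0,0) and (1,1,1)
print("contacts:    corner =", cn[corner], " buried =", cn[buried])
print("flexibility: corner > buried ?", flex[corner] > flex[buried])
print("correlation(contacts, flexibility) =", np.corrcoef(cn, flex)[0, 1])
```

In this toy lattice the most densely packed sites are the least flexible, which is the qualitative link between LPD and flexibility that the two models interpret differently. Real applications would use residue coordinates from a protein structure and the specific ENM and LPD definitions of the full study.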