GEMME: a simple and fast global epistatic model predicting mutational effects

Mol Biol Evol. 2019 Aug 12;36(11):2604-2619. doi: 10.1093/molbev/msz179. Online ahead of print.


The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering and medicine. Recent progress has been achieved by leveraging on the increasing wealth of genomic data and by modelling inter-site dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present GEMME (, an original and fast method that predicts mutational outcomes by explicitly modelling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, of very conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at: