GEMME: a simple and fast global epistatic model predicting mutational effects

Mol Biol Evol. 2019 Aug 12;36(11):2604-2619. doi: 10.1093/molbev/msz179. Online ahead of print.

Abstract

The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modelling inter-site dependencies within biological sequences. However, state-of-the-art methods remain time-consuming. Here, we present GEMME (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modelling the evolutionary history of natural sequences. This makes it possible to account for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it performs, overall, similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at: www.lcqb.upmc.fr/GEMME/.