Background: Membrane proteins constitute up to 30% of the human proteome. These proteins have special properties because their transmembrane segments are embedded in the lipid bilayer, while their extramembranous parts reside in different environments. Membrane proteins have several functions and are involved in numerous diseases. Many methods have been introduced to predict protein subcellular localization as well as the tolerance or pathogenicity of amino acid substitutions.
Results: We collected information on membrane proteins and their variants and used it to test the performance of 22 tolerance predictors. The analysis indicated that the best tools performed similarly on the transmembrane, inside and outside regions of transmembrane proteins, and that their performance on these regions was comparable to their overall performance for all types of proteins. PON-P2 had the highest performance, followed by REVEL, MetaSVM and VEST3. We also used the high-quality dataset to test the performance of seven subcellular localization predictors on membrane proteins, assessing single-pass and multi-pass membrane proteins separately. Predictions for multi-pass proteins were more reliable than those for single-pass proteins.
Conclusions: The predictors of variant effects performed better than the subcellular localization tools. The best tolerance predictors are highly reliable. Because performance differs widely between tools, end-users have to be cautious in method selection.
Keywords: Benchmark; Benchmarking; Disease-causing variant; Membrane protein; Method performance; Mutation; Variation interpretation.