Many proteins contain intrinsically disordered regions, which may be crucial for function but may also be related to the pathogenicity of variants. Prediction programs have been developed to detect disordered regions from sequences and have been used to predict the consequences of variants, although their performance for this task has not been assessed. We tested the performance of protein disorder prediction programs in detecting changes to disorder caused by amino acid substitutions. We assessed 29 protein disorder predictors and predictor versions with 101 amino acid substitutions whose effects have been experimentally validated. For variants, the disorder predictors detected true positives with at most a 6% success rate and true negatives with a 34% rate. The corresponding rates for the wild-type forms are 7% and 90%, respectively. The analysis revealed that disorder programs cannot reliably predict the effects of substitutions; consequently, the tested methods, and possibly similar programs, cannot be recommended for variant analysis without other information indicating the relevance of disorder. These results inspired us to develop a new method, PON-Diso (http://structure.bmc.lu.se/PON-Diso), for disorder-related amino acid substitutions. With a 50% success rate on an independent test set and a 70.5% rate in cross-validation, it outperforms the evaluated methods.
Keywords: amino acid substitution; bioinformatics; disease-causing variants; evaluation; prediction; protein disorder.
© 2014 WILEY PERIODICALS, INC.