Reliability measures for membrane protein topology prediction algorithms

J Mol Biol. 2003 Mar 28;327(3):735-44. doi: 10.1016/s0022-2836(03)00182-7.

Abstract

We have developed reliability scores for five widely used membrane protein topology prediction methods, and have applied them both to a test set of 92 bacterial plasma membrane proteins with experimentally determined topologies and to all predicted helix-bundle membrane proteins in three fully sequenced genomes: Escherichia coli, Saccharomyces cerevisiae and Caenorhabditis elegans. We show that the reliability scores work well for the TMHMM and MEMSAT methods, and that they allow the probability that the predicted topology is correct to be estimated for any protein. We further show that the available test set is biased towards high-scoring proteins when compared to the genome-wide data sets, and provide estimates for the expected prediction accuracy of TMHMM across the three genomes. Finally, we show that the performance of TMHMM is considerably better when limited experimental information (such as the in/out location of a protein's C terminus) is available, and estimate that at least ten percentage points in overall accuracy in whole-genome predictions can be gained in this way.
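
To illustrate how a reliability score might be turned into an estimated probability of a correct topology, and how a biased test set can still yield a genome-wide accuracy estimate, here is a minimal sketch. It is not the procedure used in the paper, which the abstract does not describe. It assumes a hypothetical labelled test set of (score, correct) pairs with scores in [0, 1], bins the scores into equal-width bins, and uses the fraction of correct predictions in each bin as the probability estimate. The function names, the bin count and the toy data are all assumptions made for illustration.

```python
from bisect import bisect_right


def calibrate(scores, correct, n_bins=4):
    """Bin test-set scores and return (interior_edges, accuracy_per_bin).

    scores  : hypothetical reliability scores in [0, 1]
    correct : booleans, True if the predicted topology matched experiment
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")
    # Interior bin edges for equal-width bins, e.g. [0.25, 0.5, 0.75].
    edges = [i / n_bins for i in range(1, n_bins)]
    hits = [0] * n_bins
    totals = [0] * n_bins
    for s, c in zip(scores, correct):
        b = bisect_right(edges, s)  # index of the bin containing s
        totals[b] += 1
        hits[b] += int(c)
    # Empty bins get None rather than a made-up accuracy.
    acc = [h / t if t else None for h, t in zip(hits, totals)]
    return edges, acc


def estimate_probability(score, edges, acc):
    """Estimated probability that a prediction with this score is correct."""
    return acc[bisect_right(edges, score)]


def expected_accuracy(genome_scores, edges, acc):
    """Mean per-protein probability over a genome-wide score list.

    Averaging per-protein estimates accounts for the genome having a
    different score distribution from the test set. Proteins falling in
    bins with no test-set data are skipped.
    """
    probs = [estimate_probability(s, edges, acc) for s in genome_scores]
    probs = [p for p in probs if p is not None]
    return sum(probs) / len(probs) if probs else None


# Toy data (invented for illustration, not from the paper).
test_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
test_correct = [False, False, True, False, True, False, True, True]
edges, acc = calibrate(test_scores, test_correct)

genome_scores = [0.1, 0.3, 0.9]  # hypothetical genome-wide scores
genome_estimate = expected_accuracy(genome_scores, edges, acc)
```

On the toy data, the test-set accuracy per bin rises with the score, and the genome-wide estimate depends on how the genome's scores are distributed across bins. This is why a test set enriched in high-scoring proteins would overstate the accuracy to be expected on a whole genome.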

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Caenorhabditis elegans / metabolism
  • Cell Membrane / metabolism*
  • Computational Biology / methods*
  • Databases as Topic
  • Escherichia coli / metabolism
  • Protein Conformation
  • Protein Structure, Tertiary
  • Proteome*
  • Reproducibility of Results
  • Saccharomyces cerevisiae / metabolism
  • Software

Substances

  • Proteome