Extensive feature detection of N-terminal protein sorting signals

Hideo Bannai; Yoshinori Tamada; Osamu Maruyama; Kenta Nakai; Satoru Miyano

doi:10.1093/bioinformatics/18.2.298

Bioinformatics. 2002 Feb;18(2):298-305. doi: 10.1093/bioinformatics/18.2.298.

Authors

Hideo Bannai¹, Yoshinori Tamada, Osamu Maruyama, Kenta Nakai, Satoru Miyano

Affiliation

¹ Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan. bannai@ims.u-tokyo.ac.jp

PMID: 11847077

Abstract

Motivation: Predicting the localization sites of proteins is an important and challenging problem in molecular biology. TargetP, by Emanuelsson et al. (J. Mol. Biol., 300, 1005-1016, 2000), is a neural-network-based system that is currently the best predictor in the literature for N-terminal sorting signals. One drawback of neural networks, however, is that it is generally difficult to understand and interpret how and why they make their predictions. In this paper, we aim to generate simple, interpretable rules as predictors while still achieving practical prediction accuracy. Our approach is an extensive search over simple rules and various attributes, partially guided by human intuition.

Results: We have succeeded in finding rules whose prediction accuracy comes close to that of TargetP while retaining a very simple and interpretable form. We also discuss and interpret the discovered rules.
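
As a rough illustration of what an extensive search over simple rules and attributes could look like, the sketch below enumerates rules of one hypothetical form: "the mean attribute value over an N-terminal window is at least a threshold". The abstract does not specify the rule form, the attributes, or the search procedure, so all of these are assumptions made for illustration. The Kyte-Doolittle hydropathy scale stands in for an attribute, and the four sequences are invented toy data, not real proteins or results from the paper.

```python
from itertools import product

# Kyte-Doolittle hydropathy values, used here only as an example attribute
# (an assumption; the abstract does not name the attributes searched).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Invented toy sequences (hypothetical, for illustration only).
# Label True means "carries an N-terminal sorting signal" in this toy setup.
TOY_EXAMPLES = [
    ("MKKLLLLVLLALAAASAQA", True),
    ("MALLAVLLLALSAAPAQAEE", True),
    ("MSDEEKRKDNEQSTPGSEDK", False),
    ("MEDKRSNQDEKTPSGEERKD", False),
]


def window_mean(seq, start, end, scale):
    """Mean attribute value over seq[start:end]; unknown residues count as 0."""
    window = seq[start:end]
    if not window:
        return 0.0
    return sum(scale.get(aa, 0.0) for aa in window) / len(window)


def search_rules(examples, scale, starts, lengths, thresholds):
    """Exhaustively score every (start, length, threshold) rule.

    A rule predicts True when the window mean is >= threshold.
    Returns (accuracy, start, length, threshold) for the first rule
    that reaches the highest accuracy on the given examples.
    """
    best = None
    for start, length, thr in product(starts, lengths, thresholds):
        correct = sum(
            (window_mean(seq, start, start + length, scale) >= thr) == label
            for seq, label in examples
        )
        accuracy = correct / len(examples)
        if best is None or accuracy > best[0]:
            best = (accuracy, start, length, thr)
    return best


if __name__ == "__main__":
    accuracy, start, length, thr = search_rules(
        TOY_EXAMPLES, KD_HYDROPATHY,
        starts=[0, 5], lengths=[5, 10, 15], thresholds=[-1.0, 0.0, 1.0, 2.0],
    )
    print(f"mean hydropathy of residues {start}..{start + length} >= {thr} "
          f"(accuracy {accuracy:.2f})")
```

A rule found this way can be read directly as a statement about the sequence, which is the kind of interpretability the abstract contrasts with neural networks. A realistic search would cover many more attributes and rule forms and would score rules on held-out data rather than on the examples used for the search.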

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence
Computational Biology*
Neural Networks, Computer
Protein Sorting Signals / genetics
Proteins / genetics*
Proteins / metabolism*
Software
Subcellular Fractions / metabolism

Substances

Protein Sorting Signals
Proteins