Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features

IEEE/ACM Trans Comput Biol Bioinform. 2014 Jul-Aug;11(4):753-65. doi: 10.1109/TCBB.2014.2318730.


Biological pathways are important elements of systems biology, and in the past decade an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of these data remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models.

Results: We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that relies mainly on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins with 28 pre-defined human Reactome pathways; evaluation indicated relative confidence in these assignments. In specific cases, manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail.

Availability: All predicted human proteins for Reactome releases 30 (2009) and 40 (2012) are available at

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Humans
  • Machine Learning
  • Models, Statistical
  • Protein Interaction Maps / physiology*
  • Proteins / physiology*
  • Signal Transduction / physiology*
  • Systems Biology / methods*

Substances

  • Proteins