Spectral affinity in protein networks

BMC Syst Biol. 2009 Nov 29:3:112. doi: 10.1186/1752-0509-3-112.

Abstract

Background: Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length.

Results: We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random walk based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. PageRank considers paths of all lengths in a network, therefore PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method we build a tool that quickly finds nodes closest to a queried protein in any protein network, and easily scales to much larger biological networks.

Conclusion: We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to quickly find nodes closest to a queried vertex in any protein network available from BioGRID or specified by the user.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Models, Theoretical*
  • Protein Interaction Mapping / methods*
  • Proteins / metabolism*
  • Software*

Substances

  • Proteins