Comparative Study
Bioinformatics. 2015 May 15;31(10):1632-9.
doi: 10.1093/bioinformatics/btv026. Epub 2015 Jan 20.

Topology-function conservation in protein-protein interaction networks

Darren Davis et al. Bioinformatics. 2015.

Abstract

Motivation: Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (similar topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, one function may be significantly associated with specific topologies, while another may be weakly associated with several different topologies. The topology-function relationships may also differ between species.

Results: To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms.
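As a rough illustration of the canonical correlation analysis (CCA) step named above, the following is a minimal sketch, not the authors' implementation. It assumes NumPy is available, uses synthetic stand-ins for the graphlet degree and GO annotation matrices, and adds a small ridge term `reg` (an assumption made here for numerical stability). The function name `canonical_weights` is hypothetical.

```python
import numpy as np

def canonical_weights(X, Y, reg=1e-8):
    """Return (W1, W2, rho): canonical weight matrices and correlations.

    X is an n x p matrix (topology-like features), Y is an n x q matrix
    (annotation-like features). reg is a small ridge term (an assumption).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both blocks with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T} is the cross-covariance of the whitened data.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations.
    U, rho, Vt = np.linalg.svd(M, full_matrices=False)
    W1 = np.linalg.solve(Lx.T, U)     # weights for the columns of X
    W2 = np.linalg.solve(Ly.T, Vt.T)  # weights for the columns of Y
    return W1, W2, rho

# Synthetic stand-ins (not data from the paper): both blocks share one latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
gdv_like = z @ np.array([[1.0, 2.0, 0.5]]) + 0.5 * rng.normal(size=(200, 3))
go_like = z @ np.array([[1.0, -1.0]]) + 0.5 * rng.normal(size=(200, 2))

W1, W2, rho = canonical_weights(gdv_like, go_like)
print("first canonical correlation:", rho[0])
```

The first columns of `W1` and `W2` give the pair of linear combinations with the highest Pearson correlation. How the framework turns such weights into its topology-function associations is not specified in the abstract, so the sketch stops at the canonical weights.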

Figures

Fig. 1.
Graphlets. (A) The thirty 2- to 5-node graphlets, denoted by G0,…,G29, and their 73 automorphism orbits, denoted by 0,1,…,72 (Pržulj, 2007). (B) An illustration of the graphlet degree vector (GDV) of node v: node v is touched by four edges (orbit 0, left panel), one triangle (orbit 3, middle panel) and one four-node cycle (orbit 8, right panel). In this way, the GDV quantifies the wiring of a node in the network (Milenković and Pržulj, 2008).
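To make the three GDV entries mentioned in this caption concrete, here is a minimal sketch that counts orbits 0, 3 and 8 for one node. The toy graph is hypothetical (the actual graph in the figure is not reproduced here) and was chosen so that node v touches four edges, one triangle and one four-node cycle. Graphlets are induced subgraphs, so the four-node cycle must have no chords.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj

def orbit0(adj, v):
    """Orbit 0: number of edges touching v (its degree)."""
    return len(adj[v])

def orbit3(adj, v):
    """Orbit 3: number of triangles containing v."""
    return sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])

def orbit8(adj, v):
    """Orbit 8: number of induced four-node cycles containing v."""
    count = 0
    # x and z are v's two neighbours on the cycle; they must not be adjacent.
    for x, z in combinations(adj[v], 2):
        if z in adj[x]:
            continue
        # y closes the cycle opposite v and must not be adjacent to v.
        for y in (adj[x] & adj[z]) - {v}:
            if y not in adj[v]:
                count += 1
    return count

# Hypothetical toy graph: triangle v-a-b and four-node cycle v-c-e-d.
edges = [("v", "a"), ("v", "b"), ("v", "c"), ("v", "d"),
         ("a", "b"), ("c", "e"), ("e", "d")]
adj = build_adjacency(edges)
gdv_entries = (orbit0(adj, "v"), orbit3(adj, "v"), orbit8(adj, "v"))
print(gdv_entries)  # (4, 1, 1)
```

A full GDV has 73 entries, one per orbit; this sketch covers only the three orbits named in the caption.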
Fig. 2.
Our method for identifying the species-consistent relationships between network topology and biological function. Panel A illustrates the construction of the association matrix from CCA. CCA identifies weight matrices W1 and W2 that maximize the Pearson’s correlation between the resulting canonical variates. These weight matrices are used to define an association matrix that transforms GDVs into topology-based GO annotations. Panel B shows the process of identifying and characterizing single-species topology-function associations. The association matrix is used to compute the topology-based GO annotations, which indicate how strongly each GO term is associated with a given GDV. The Pearson’s correlation between the topology-based GO annotations and the observed GO annotations gives the structure association strengths, which indicate the extent to which each GO term is associated with network structure. The Pearson’s correlation between the GDVs and the topology-based GO annotations gives the orbit contribution strengths, which explain the involvement of each orbit in the topology-function association of each GO term. Panel C illustrates the identification of orthologous topology-function associations. For a pair of species, the multi-species structure association strength is computed as the minimum of the two per-species structure association strengths. Orbit contribution similarities for the GO terms are quantified via the Spearman’s correlation of the per-species orbit contribution strengths.
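The quantities in Panels B and C can be sketched with plain correlation functions. The snippet below is an illustration under stated assumptions, not the authors' code: all numbers are made up, the variable names are hypothetical, and the topology-based annotations are supplied directly rather than derived from an association matrix.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up values for one GO term over six proteins per species.
observed_yeast = [1, 0, 1, 0, 1, 0]
predicted_yeast = [0.8, 0.2, 0.6, 0.4, 0.9, 0.1]
observed_human = [1, 1, 0, 0, 1, 0]
predicted_human = [0.7, 0.4, 0.5, 0.2, 0.6, 0.3]

# Panel B: per-species structure association strength.
strength_yeast = pearson(predicted_yeast, observed_yeast)
strength_human = pearson(predicted_human, observed_human)

# Panel C: multi-species strength is the minimum of the two.
multi_species_strength = min(strength_yeast, strength_human)

# Panel C: similarity of made-up orbit contribution strengths (five orbits).
orbit_strengths_yeast = [0.9, 0.1, 0.4, 0.7, 0.2]
orbit_strengths_human = [0.8, 0.2, 0.3, 0.6, 0.1]
orbit_similarity = spearman(orbit_strengths_yeast, orbit_strengths_human)

print(multi_species_strength, orbit_similarity)
```

Taking the minimum means a GO term scores highly only when its association with network structure is strong in both species, while the rank-based Spearman correlation compares which orbits matter most without requiring the strengths to be on the same scale.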
Fig. 3.
The orbit contribution strength profiles of non-redundant terms that have significantly conserved topology-function relationships. The heatmap at the top summarizes the significant patterns for the non-redundant BP terms, and the heatmap at the bottom summarizes the significant patterns for the non-redundant CC terms. Each heatmap row corresponds to the average orbit contribution strength profile of the GO term that represents the redundant group. Each heatmap cell represents the maximum orbit contribution strength in the relevant orbit group (see Fig. 1A for illustrations of orbits 0,1,…,72). For illustrative purposes, graphlet orbits are grouped based on the similarity of their graphlet degrees, following the methodology of Yaveroğlu et al. (2014) (explained in Supplementary Section S.8). The orbit groups that do not have any significantly high orbit contribution strengths are coloured semi-transparently. Note that a cell plotted in a solid colour does not mean that all orbits in the relevant group have significant relationships with the GO term, only that at least one of them does (for the exact list of significant orbits, see Supplementary Data S1). Black nodes of the graphlets on the right denote the orbits of the corresponding column in the heatmap.
Fig. 4.
Illustration of the identified topological characteristics in case studies 1 and 2. The small circles represent proteins and the lines connecting them represent edges. The cellular localization term is identified to be significantly linked with mediator positions in sparse graphlets, i.e. graphlet orbits 0, 2, 7, 11, 16, 21, 23, 33, 42 and 44. We illustrate such connectivity patterns for the proteins of this function on the cell membrane, over the green membrane pores (circles filled with red). The DNA-dependent transcription initiation term is identified to be significantly linked with dense connections and clique-like patterns, i.e. graphlet orbits 3, 13, 14, 58, 61, 67, 69, 71 and 72. We illustrate such connectivity patterns inside the nucleus, over the transcription factors and DNA (circles filled with blue).
