100 (21), 12123-8

Protein Complexes and Functional Modules in Molecular Networks

Victor Spirin et al. Proc Natl Acad Sci U S A.

Abstract

Proteins, nucleic acids, and small molecules form a dense network of molecular interactions in a cell. Molecules are nodes of this network, and the interactions between them are edges. The architecture of molecular networks can reveal important principles of cellular organization and function, much as protein structure tells us about the function and organization of a protein. Computational analysis of molecular networks has been primarily concerned with node degree [Wagner, A. & Fell, D. A. (2001) Proc. R. Soc. London Ser. B 268, 1803-1810; Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. (2000) Nature 407, 651-654] or degree correlation [Maslov, S. & Sneppen, K. (2002) Science 296, 910-913], and hence focused on single/two-body properties of these networks. Here, by analyzing the multibody structure of the network of protein-protein interactions, we discovered molecular modules that are densely connected within themselves but sparsely connected with the rest of the network. Comparison with experimental data and functional annotation of genes showed two types of modules: (i) protein complexes (splicing machinery, transcription factors, etc.) and (ii) dynamic functional units (signaling cascades, cell-cycle regulation, etc.). The discovered modules are statistically highly significant, as is evident from comparison with random graphs, and are robust to noise in the data. Our results provide strong support for the network modularity principle introduced by Hartwell et al. [Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. (1999) Nature 402, C47-C52], suggesting that the modules found constitute the "building blocks" of molecular networks.
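
The notion of a module that is "densely connected within itself but sparsely connected with the rest of the network" can be made concrete with a small sketch. It assumes the density Q of a node set is the fraction of possible internal links that are present, Q = 2m / (n(n - 1)), which agrees with the figure captions' use of Q = 1 for complete cliques. The function names and the toy graph are hypothetical and do not come from the paper.

```python
def density(nodes, edges):
    """Assumed definition: Q = 2m / (n(n-1)), where m counts links inside `nodes`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Count links whose two endpoints both lie inside the node set.
    m = sum(1 for u, v in edges if u in nodes and v in nodes)
    return 2.0 * m / (n * (n - 1))


def boundary_edges(nodes, edges):
    """Number of links with exactly one endpoint inside `nodes`."""
    nodes = set(nodes)
    return sum(1 for u, v in edges if (u in nodes) != (v in nodes))


# Hypothetical toy network: a 4-node clique {a, b, c, d} with a short tail d-e-f.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"),
]

clique = {"a", "b", "c", "d"}
mixed = {"c", "d", "e", "f"}

print(density(clique, toy_edges), boundary_edges(clique, toy_edges))
print(density(mixed, toy_edges), boundary_edges(mixed, toy_edges))
```

In this toy case the clique has all six possible internal links and a single outgoing link, whereas the mixed set has only three internal links and four outgoing ones, so only the first looks like a module in the sense described above.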

Figures

Fig. 1.
Statistical significance of complexes and modules. (A) Number of complete cliques (Q = 1) as a function of clique size enumerated in the network of protein interactions (red) and in randomly rewired graphs (blue, averaged over 1,000 graphs). Inset shows the same plot in log-normal scale. Note the dramatic enrichment in the number of cliques in the protein-interaction graph. Most of these cliques are parts of bigger complexes and modules. (B) Distribution of Q of clusters found by the MC search procedure in the randomly rewired graphs (blue bars). The blue line shows approximation of this distribution by the Fisher–Tippett extreme value distribution (EVD) with two fitted parameters. Red bars show complexes found in the original network of protein interactions. Sizes of the subgraphs are n = 8, 10, and 16. Note that real complexes have many more interactions than the tightest complexes found in randomly rewired graphs.
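
The comparison in panel A can be illustrated with a minimal sketch: rewire the network at random, then count complete cliques of a given size in the original and in the rewired graph. The caption does not spell out the rewiring scheme; the sketch assumes a degree-preserving double-edge swap, which is one common choice. The clique counter is brute force and only suitable for toy graphs. All names and the toy network are hypothetical.

```python
import random
from itertools import combinations


def degrees(edges):
    """Map each node to its number of links."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def rewire(edges, n_swaps, seed=0):
    """Assumed scheme: swap endpoints of random edge pairs, keeping every node's degree."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # try both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue  # a shared node would create a self-loop or change degrees
        e1, e2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if e1 in edge_set or e2 in edge_set:
            continue  # reject swaps that would duplicate an existing link
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {e1, e2}
        edge_list[i], edge_list[j] = e1, e2
        done += 1
    return edge_set


def count_cliques(edges, k):
    """Brute-force count of fully connected k-node subsets (toy graphs only)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return sum(
        1
        for combo in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(combo, 2))
    )


# Hypothetical toy network: a 5-node clique (nodes 0-4) bridged to a 10-node ring (nodes 5-14).
toy = set(combinations(range(5), 2))
toy |= {(i, i + 1) for i in range(5, 14)} | {(5, 14), (4, 5)}

shuffled = rewire(toy, n_swaps=50, seed=1)
print(count_cliques(toy, 3), count_cliques(shuffled, 3))
```

Averaging the second count over many seeds gives the kind of random baseline against which the enrichment of cliques in the real network is judged.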
Fig. 2.
Fragment of the protein network. Nodes and interactions in discovered clusters are shown in bold. Nodes are colored by functional categories in MIPS (20): red, transcription regulation; blue, cell-cycle/cell-fate control; green, RNA processing; and yellow, protein transport. Complexes shown are the SAGA/TFIID complex (red), the anaphase-promoting complex (blue), and the TRAPP complex (yellow).
Fig. 3.
Examples of discovered functional modules. (A) A module involved in cell-cycle regulation. This module consists of cyclins (CLB1-4 and CLN2) and cyclin-dependent kinases (CKS1 and CDC28) and a nuclear import protein (NIP29). Although they have many interactions, these proteins are not present in the cell at the same time. (B) Pheromone signal transduction pathway in the network of protein–protein interactions. This module includes several MAPK (mitogen-activated protein kinase) and MAPKK (mitogen-activated protein kinase kinase) kinases, as well as other proteins involved in signal transduction. These proteins do not form a single complex; rather, they interact in a specific order.
Fig. 4.
Comparison of discovered complexes and modules with complexes derived experimentally (BIND and Cellzome) and complexes catalogued in MIPS. Discovered complexes are sorted by the overlap with the best-matching experimental complex (see Methods and Supporting Text). The overlap is defined as the number of common proteins divided by the number of proteins in the best-matching experimental complex. The first 31 complexes match exactly, and another 11 have overlap above 65%. Inset shows the overlap as a function of the size of the discovered complex. Note that discovered complexes of all sizes match very well with known experimental complexes. Discovered complexes that do not match with experimental ones constitute our predictions (see Discussion for details).
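
The overlap measure defined in this caption is simple enough to sketch directly: the number of proteins shared with the best-matching experimental complex, divided by the size of that experimental complex. How the best match is chosen is described in the paper's Methods, not here; the sketch assumes it is the reference complex sharing the most proteins, with ties broken by the higher overlap. Protein and complex names are placeholders.

```python
def best_overlap(discovered, reference_complexes):
    """Return (name, overlap) for the best-matching reference complex.

    overlap = |discovered & reference| / |reference|, as in the caption.
    Assumption: "best-matching" means most shared proteins, ties broken by overlap.
    """
    discovered = set(discovered)
    best_key, best_name, best_score = (0, 0.0), None, 0.0
    for name, members in reference_complexes.items():
        members = set(members)
        if not members:
            continue
        shared = len(discovered & members)
        score = shared / len(members)
        if (shared, score) > best_key:
            best_key, best_name, best_score = (shared, score), name, score
    return best_name, best_score


# Placeholder data, not taken from BIND, Cellzome, or MIPS.
references = {
    "ref_A": {"P1", "P2", "P3", "P4", "P5"},
    "ref_B": {"P4", "P6"},
}
print(best_overlap({"P1", "P2", "P3", "P4"}, references))
```

Under this definition an overlap of 1.0 means every protein of the experimental complex appears in the discovered one; a discovered cluster with no shared proteins returns `(None, 0.0)`.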
Fig. 5.
The fraction of clusters recovered in the randomly perturbed network, as a function of the fraction of altered links. Black curves correspond to the case when links are randomly rewired; red, randomly removed (true negatives); and green, randomly added (false positives). The original cluster is said to be recovered if the perturbed network has a cluster that shares at least 50% of the nodes with the original one. Each perturbation was repeated 10 times. Also see Fig. 9, which is published as supporting information on the PNAS web site.
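
The recovery criterion in this caption can be sketched as follows. The sketch assumes that "shares at least 50% of the nodes" is measured relative to the size of the original cluster, and it includes only the link-removal perturbation; the cluster search itself is not reproduced. Function names and data are hypothetical.

```python
import random


def remove_links(edges, fraction, seed=0):
    """Drop a random fraction of links (the link-removal perturbation)."""
    rng = random.Random(seed)
    edge_list = sorted(edges)
    k = int(round(fraction * len(edge_list)))
    removed = set(rng.sample(edge_list, k))
    return set(edge_list) - removed


def is_recovered(original, perturbed_clusters, threshold=0.5):
    """True if some perturbed cluster contains >= threshold of the original's nodes.

    Assumption: the 50% is taken relative to the original cluster's size.
    """
    original = set(original)
    needed = threshold * len(original)
    return any(len(original & set(c)) >= needed for c in perturbed_clusters)


def fraction_recovered(original_clusters, perturbed_clusters, threshold=0.5):
    """Share of original clusters that are recovered after perturbation."""
    if not original_clusters:
        return 0.0
    hits = sum(is_recovered(c, perturbed_clusters, threshold) for c in original_clusters)
    return hits / len(original_clusters)


# Placeholder clusters: the first keeps 2 of its 4 nodes, the second only 1 of 4.
originals = [{1, 2, 3, 4}, {5, 6, 7, 8}]
after_noise = [{1, 2, 9}, {5, 10}]
print(fraction_recovered(originals, after_noise))
```

Repeating the perturbation with different seeds and averaging `fraction_recovered` at each noise level would yield curves of the kind the caption describes.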
