Link-Prediction Enhanced Consensus Clustering for Complex Networks

PLoS One. 2016 May 20;11(5):e0153384. doi: 10.1371/journal.pone.0153384. eCollection 2016.

Abstract

Many real networks that are collected or inferred from data are incomplete due to missing edges. Missing edges can be inherent to the dataset (Facebook friend links will never be complete) or the result of sampling (one may only have access to a portion of the data). As a consequence, downstream analyses that "consume" the network will often yield less accurate results than they would if the edges were complete. Community detection algorithms, in particular, often suffer when critical intra-community edges are missing. We propose a novel consensus clustering algorithm to enhance community detection on incomplete networks. Our framework applies existing community detection algorithms to networks imputed by our link-prediction-based sampling algorithm and merges the resulting partitions into a final consensus output. On average, our method boosts the performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook.
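
The pipeline described in the abstract (impute edges by link prediction, cluster each imputed network, merge the partitions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a link predictor, a base community detection algorithm, or a merging rule, so the common-neighbor score, the label propagation detector, the co-association threshold of 0.5, and every function and parameter name below are assumptions chosen only to keep the example small and dependency-free.

```python
import random
from collections import defaultdict
from itertools import combinations


def adjacency(nodes, edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def impute_edges(nodes, edges, n_add, rng):
    """Sample up to n_add missing edges, weighted by common-neighbor count.

    Assumption: common neighbors stands in for the paper's link predictor.
    """
    adj = adjacency(nodes, edges)
    candidates, weights = [], []
    for u, v in combinations(sorted(nodes), 2):
        if v not in adj[u]:
            score = len(adj[u] & adj[v])
            if score > 0:
                candidates.append((u, v))
                weights.append(score)
    added = set()
    while len(added) < min(n_add, len(candidates)):
        added.add(rng.choices(candidates, weights=weights, k=1)[0])
    return list(edges) + sorted(added)


def label_propagation(nodes, edges, rng, max_iter=100):
    """Assumed base detector: asynchronous label propagation."""
    adj = adjacency(nodes, edges)
    label = {v: v for v in nodes}
    order = sorted(nodes)
    for _ in range(max_iter):
        rng.shuffle(order)
        changed = False
        for v in order:
            if not adj[v]:
                continue
            counts = defaultdict(int)
            for w in adj[v]:
                counts[label[w]] += 1
            best = max(counts.values())
            top = sorted(l for l, c in counts.items() if c == best)
            if label[v] not in top:
                label[v] = rng.choice(top)
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for v, l in label.items():
        groups[l].add(v)
    return list(groups.values())


def consensus(nodes, partitions, threshold=0.5):
    """Merge partitions: keep node pairs co-clustered in more than
    `threshold` of the partitions, then take connected components."""
    co = defaultdict(int)
    for part in partitions:
        for community in part:
            for u, v in combinations(sorted(community), 2):
                co[(u, v)] += 1
    strong = [p for p, c in co.items() if c / len(partitions) > threshold]
    adj = adjacency(nodes, strong)
    seen, result = set(), []
    for v in sorted(nodes):
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        result.append(comp)
    return result


def consensus_clustering(nodes, edges, n_samples=20, n_add=2, seed=0):
    """Impute, cluster, and merge; all defaults are illustrative."""
    rng = random.Random(seed)
    partitions = []
    for _ in range(n_samples):
        imputed = impute_edges(nodes, edges, n_add, rng)
        partitions.append(label_propagation(nodes, imputed, rng))
    return consensus(nodes, partitions)
```

Calling `consensus_clustering(nodes, edges)` on an edge list with missing intra-community links returns a list of node sets. Thresholding a co-association matrix is one common way to merge partitions; the paper's own merging step may differ.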

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Consensus*
  • Neural Networks, Computer*

Grants and funding

This work was supported by National Science Foundation grant IGERT-0903629 (http://www.nsf.gov/awardsearch/showAward?AWD_ID=0903629) and Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center contract number D11PC20155 (http://www.iarpa.gov/index.php/research-programs/fuse). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.