Link Clustering with Extended Link Similarity and EQ Evaluation Division

PLoS One. 2013 Jun 19;8(6):e66005. doi: 10.1371/journal.pone.0066005. Print 2013.

Abstract

Link Clustering (LC) is a relatively new method for detecting overlapping communities in networks. The basic principle of LC is to derive a transform matrix whose elements are the similarities of neighbor links, computed with the Jaccard distance; it then applies hierarchical clustering to the transform matrix and uses a measure of partition density on the resulting dendrogram to determine the cut level that gives the best community detection. However, the original link clustering method does not consider the similarity of non-neighbor links, and the partition density tends to divide the network into many small communities. In this paper, an Extended Link Clustering method (ELC) for overlapping community detection is proposed. The improved method employs a new link similarity, Extended Link Similarity (ELS), to produce a denser transform matrix, and uses the maximum value of EQ (an extended measure of quality of modularity) to choose the cut level of the dendrogram, giving a better partitioning of the original network space. Since ELS uses more link information, the resulting transform matrix provides a superior basis for clustering and analysis. Further, by using the EQ value to find the best level at which to divide the hierarchical clustering dendrogram, we obtain communities that are more sensible than the ones obtained by the partition density evaluation. Experiments on five real-world networks and on artificially generated networks show that the ELC method achieves higher EQ and In-group Proportion (IGP) values. Additionally, the communities are more realistic than those generated by either the original LC method or the classical CPM method.
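The quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the definitions below are assumptions: the neighbor-link similarity and partition density follow the commonly cited formulation of the original LC method (Ahn et al.), and EQ follows the overlapping-modularity definition usually attributed to Shen et al.; the paper's exact forms may differ. ELS is not shown because the abstract does not define it. All function names are hypothetical, and the graph is assumed to be simple (undirected, no self-loops, no duplicate edges).

```python
from itertools import combinations


def inclusive_neighbors(edges):
    """Map each node to the set holding the node itself and its neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs


def link_similarity(edges):
    """Jaccard similarity for each pair of links sharing exactly one node.

    Assumed form: for links (i, k) and (j, k), compare the inclusive
    neighbor sets of the two unshared endpoints i and j.
    """
    nbrs = inclusive_neighbors(edges)
    sims = {}
    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # non-neighbor links receive no similarity in plain LC
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        sims[(e1, e2)] = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
    return sims


def partition_density(link_communities):
    """Assumed form: D = (2/M) * sum_c m_c (m_c - (n_c - 1)) / ((n_c - 2)(n_c - 1))."""
    total_links = sum(len(c) for c in link_communities)
    weighted = 0.0
    for links in link_communities:
        m = len(links)
        n = len({node for link in links for node in link})
        if n > 2:  # a single-link community contributes zero
            weighted += m * (m - (n - 1)) / ((n - 2) * (n - 1))
    return 2.0 * weighted / total_links


def extended_modularity(edges, node_communities):
    """Assumed form: EQ = (1/2m) sum_c sum_{v,w in c} (A_vw - k_v k_w / 2m) / (O_v O_w),
    where O_v is the number of communities that contain node v."""
    degree, memberships = {}, {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    for comm in node_communities:
        for node in comm:
            memberships[node] = memberships.get(node, 0) + 1
    edge_set = {frozenset(e) for e in edges}
    two_m = 2 * len(edges)
    eq = 0.0
    for comm in node_communities:
        for v in comm:
            for w in comm:
                a_vw = 1.0 if v != w and frozenset((v, w)) in edge_set else 0.0
                expected = degree.get(v, 0) * degree.get(w, 0) / two_m
                eq += (a_vw - expected) / (memberships[v] * memberships[w])
    return eq / two_m


# Toy graph: a triangle a-b-c with a pendant link c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
sims = link_similarity(edges)
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
d_split = partition_density([triangle, [("c", "d")]])  # 0.75
d_merged = partition_density([edges])                  # 1/3
eq_split = extended_modularity(edges, [{"a", "b", "c"}, {"c", "d"}])
```

On this toy graph the partition density is higher when the pendant link is split off (0.75) than when all links are merged (1/3). In a full pipeline, either score would be evaluated at every level of the dendrogram and the level with the maximum value kept as the cut.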

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Models, Theoretical

Grants and funding

This work is supported by the Natural Science Foundation of China (60903097) and Jilin Innovation Team Project (20122805) and assisted by Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.