Clustering by passing messages between data points
- PMID: 17218491
- DOI: 10.1126/science.1136800
Abstract
Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such "exemplars" can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called "affinity propagation," which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.
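The abstract names the input (pairwise similarities) and the mechanism (real-valued messages exchanged between points) but does not spell out the update rules. The sketch below is one way to illustrate the idea, assuming the "responsibility" and "availability" updates commonly associated with affinity propagation, plus a damping factor. The function name, the damping value, the fixed iteration count, the toy data, and the diagonal "preference" of -10 are illustrative assumptions and do not come from this record.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, n_iter=200):
    """Sketch of message passing on a similarity matrix S (n x n).

    S[i, k] is the similarity of point i to candidate exemplar k; the
    diagonal S[k, k] holds an assumed per-point "preference".
    Returns (exemplar indices, exemplar index assigned to each point).
    """
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    rows = np.arange(n)
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)

    for _ in range(n_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1.0 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # Self-availability: a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0.0)
        A_new[rows, rows] = self_avail
        A = damping * A + (1.0 - damping) * A_new

    # A point is taken as an exemplar when a(k,k) + r(k,k) > 0.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy usage: two well-separated 1-D groups, negative squared distance as similarity.
points = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
S = -(points[:, None] - points[None, :]) ** 2
np.fill_diagonal(S, -10.0)  # assumed preference; lower values tend to give fewer exemplars
exemplars, labels = affinity_propagation(S)
```

In this sketch no initial subset of exemplars is chosen at random: every point starts as a candidate, and the diagonal of S controls how readily a point becomes an exemplar.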
Comment in
- Computer science. Where are the exemplars? Science. 2007 Feb 16;315(5814):949-51. doi: 10.1126/science.1139678. PMID: 17303742. No abstract available.
- Comment on "Clustering by passing messages between data points". Science. 2008 Feb 8;319(5864):726; author reply 726. doi: 10.1126/science.1150938. PMID: 18258881.
