Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension
- PMID: 25794172
- PMCID: PMC4367988
- DOI: 10.1371/journal.pone.0117390
Abstract
The objective of text clustering is to divide a document collection into clusters based on the similarity between documents. This paper proposes an extension-based feature modeling approach towards semantically sensitive text clustering, along with the corresponding feature space construction and similarity computation methods. Combining the similarity in the traditional feature space with that in the extension space mitigates the adverse effects of the complexity and diversity of natural language and correspondingly improves the semantic sensitivity of clustering. The generated clusters can be organized at different granularities. Experimental evaluations on well-known clustering algorithms and datasets verify the effectiveness of our approach.
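The idea of combining two similarities can be illustrated with a minimal sketch. The abstract does not specify how the extension space is built or how the two similarities are merged, so everything below is an assumption made for illustration: the lookup table `EXTENSION`, the function names, the use of cosine similarity in both spaces, and the weighted linear combination controlled by `alpha` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical extension table (an assumption, not from the paper):
# maps a surface term to related concepts in the extension space.
EXTENSION = {
    "car": ["vehicle"],
    "automobile": ["vehicle"],
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def extension_vector(tokens, table):
    """Project tokens into the (assumed) extension space via the table."""
    ext = Counter()
    for token in tokens:
        for concept in table.get(token, []):
            ext[concept] += 1
    return ext


def combined_similarity(tokens_a, tokens_b, table, alpha=0.5):
    """Weighted sum of traditional and extension-space similarity.

    alpha is an assumed mixing weight: 1.0 uses only the traditional
    term space, 0.0 uses only the extension space.
    """
    s_trad = cosine(Counter(tokens_a), Counter(tokens_b))
    s_ext = cosine(extension_vector(tokens_a, table),
                   extension_vector(tokens_b, table))
    return alpha * s_trad + (1 - alpha) * s_ext


doc_a = ["car", "engine"]
doc_b = ["automobile", "engine"]
print(combined_similarity(doc_a, doc_b, EXTENSION, alpha=1.0))  # term space only
print(combined_similarity(doc_a, doc_b, EXTENSION, alpha=0.5))  # both spaces
```

In this toy case the two documents share only one literal term, but both map to the same concept in the assumed extension table, so the combined score is higher than the purely term-based one. That is the kind of synonym mismatch the abstract attributes to the diversity of natural language.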
