Text Similarity Calculation based on Domain Ontology and Concept Clustering
DOI: 10.25236/icmit.2017.78
Author(s)
ZhiQiang Zhang, Lili Gao
Corresponding Author
ZhiQiang Zhang
Abstract
This paper proposes a swarm intelligence-based web document clustering algorithm. The algorithm first represents each web document with the vector space model (VSM). Conventional preprocessing, such as removing useless words and applying feature-word reduction rules, is used to obtain the textual feature set, and the resulting document vectors are then randomly distributed on a plane. The documents are clustered with the swarm intelligence-based method, and the clustering result is finally collected with a recursive algorithm. The experimental results indicate that the swarm intelligence-based web document clustering algorithm has better clustering characteristics and is able to completely and accurately cluster web documents related to a subject. In addition, a comparative analysis from multiple aspects shows that its clustering results are superior to those obtained by the SOM algorithm.
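The preprocessing steps named in the abstract (VSM representation, removal of useless words, random placement on a plane) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the function names to_vector, cosine, and scatter, the use of raw term frequencies, the cosine measure, and the 10 x 10 grid are all assumptions made for illustration, and the swarm clustering and recursive collection steps are not shown.

```python
import math
import random
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "the", "of", "is", "and", "to", "in", "on"}

def to_vector(text):
    """Term-frequency vector of a document after stop-word removal."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return Counter(tokens)

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def scatter(vectors, width, height, seed=0):
    """Assign each document index a distinct random cell on a width x height plane."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    return dict(zip(range(len(vectors)), rng.sample(cells, len(vectors))))

# Toy documents, invented for this example.
docs = [
    "the ontology of concepts",
    "concepts and ontology",
    "swarm clustering on the web",
]
vectors = [to_vector(d) for d in docs]
positions = scatter(vectors, width=10, height=10)
```

In a swarm-based scheme, a similarity measure such as cosine would typically drive the decisions of agents that pick up and drop documents on the plane, so that similar documents end up in neighboring cells.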
Keywords
Semantic Text, Domain Ontology, Concept, Document Clustering, Similarity.