Text Similarity Calculation based on Domain Ontology and Concept Clustering

ZhiQiang Zhang, Lili Gao

DOI: 10.25236/icmit.2017.78

Corresponding Author

ZhiQiang Zhang

Abstract

This paper proposes a swarm intelligence-based web document clustering algorithm. The algorithm first represents each web document with the vector space model (VSM). Conventional preprocessing, that is, removing useless words and applying feature-word reduction rules, yields the textual feature set, and the resulting document vectors are then randomly distributed on a plane. The documents are clustered with the swarm intelligence-based method, and the clustering result is finally collected by a recursive algorithm. The experimental results indicate that the algorithm has good clustering characteristics and can completely and accurately cluster web documents related to a given subject. A comparative analysis from multiple aspects further indicates that its clustering results are superior to those obtained by the self-organizing map (SOM) algorithm.
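
The vector space representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the whitespace tokenizer, the raw term-frequency weighting, and the function names (tokenize, to_vector, cosine_similarity) are all assumptions made for illustration, and the swarm-based clustering step itself is not shown.

```python
import math
from collections import Counter

# Hypothetical stop-word list, for illustration only (not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (w.strip(".,;:!?()\"'") for w in text.lower().split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def to_vector(text):
    """Represent a document as a sparse term-frequency vector (assumed weighting)."""
    return Counter(tokenize(text))


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Example documents (invented for this sketch).
docs = [
    "Clustering of web documents with swarm intelligence",
    "Swarm intelligence for document clustering on the web",
    "Domain ontology and concept similarity",
]
vectors = [to_vector(d) for d in docs]
similarities = [
    [cosine_similarity(a, b) for b in vectors] for a in vectors
]
```

Here similarities[i][j] holds the pairwise similarity of documents i and j. A clustering method of the kind described in the abstract would consume such vectors or similarities; how the paper weights terms or compares documents is not stated in the abstract.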

Keywords

Semantic Text, Domain Ontology, Concept, Document Clustering, Similarity.