Research on Decision Strategy Algorithms of Web Crawler Vector Space Model Based on Schmidt Orthogonal Optimization
DOI: 10.25236/ciais.2019.017
Author(s)
Mu Yang and Yanmei Hu
Corresponding Author
Yanmei Hu
Abstract
This paper constructs a decision strategy algorithm for the vector space model of a web crawler, based on Schmidt orthogonal optimization. The algorithm applies the Schmidt (Gram-Schmidt) method to orthogonally optimize the classification vectors in the vector space model. Orthogonal optimization corrects base coordinate axes that project onto one another at small angles and rotates the vectors relative to the coordinate axes, which removes the influence of correlation between document vectors on classification and can improve retrieval accuracy. The KNN algorithm is then used to classify the orthogonally optimized document vectors. Experiments show that this method can largely eliminate the influence of correlation on retrieval results during document classification in a web crawler, and that it improves the accuracy of document classification.
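The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the "classified vectors" are per-category axis vectors in term space, that documents are re-expressed in the orthonormal basis produced by Gram-Schmidt, and that KNN uses Euclidean distance with majority voting. All function names, labels, and numeric values below are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, eps=1e-12):
    """Orthonormalize vectors in order; near-dependent vectors are skipped."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)  # component of w along an existing axis
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


def project(doc, basis):
    """Coordinates of a document vector in the orthonormal basis."""
    return [dot(doc, b) for b in basis]


def knn_classify(query, labeled, k=3):
    """Majority vote among the k nearest labeled points (Euclidean distance)."""
    nearest = sorted(labeled, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical category axes that share a term, so they are correlated
# (the angle between them is 60 degrees rather than 90).
category_axes = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
basis = gram_schmidt(category_axes)

# Hypothetical labeled term-weight vectors.
training = [
    ([2.0, 2.0, 0.0], "sports"),
    ([3.0, 2.5, 0.2], "sports"),
    ([2.0, 0.0, 2.0], "tech"),
    ([2.5, 0.3, 3.0], "tech"),
]
projected = [(project(vec, basis), label) for vec, label in training]

query = project([1.0, 0.1, 1.2], basis)
label = knn_classify(query, projected, k=3)
print(label)
```

After orthogonalization the two axes no longer overlap, so a weight on the shared term is not counted toward both categories when distances are computed. Whether the paper weights neighbours or uses a different distance measure is not stated in the abstract.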
Keywords
Search engines; Schmidt orthogonal; Vector space model; Text classification