Web of Proceedings - Francis Academic Press

Research on Decision Strategy Algorithms of Web Crawler Vector Space Model Based on Schmidt Orthogonal Optimization

DOI: 10.25236/ciais.2019.017

Author(s)

Mu Yang and Yanmei Hu

Corresponding Author

Yanmei Hu

Abstract

This paper constructs a decision strategy algorithm for the vector space model of a web crawler, based on Schmidt orthogonal optimization. The algorithm uses the Schmidt method to orthogonalize the classified vectors in the vector space model. Orthogonal optimization corrects base coordinate axes that have small-angle projections onto one another and rotates the vectors within the coordinate system, which removes the influence of correlation between document vectors on classification and can improve retrieval accuracy. The KNN algorithm is then used to classify the orthogonally optimized document vectors. Experiments show that this method can largely eliminate the influence of correlation on retrieval results during document classification in a web crawler, and that it improves the accuracy of document classification.
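
The two stages named in the abstract, Schmidt orthogonalization followed by KNN classification, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the classical Gram-Schmidt procedure, the Euclidean distance, the majority vote, the function names `gram_schmidt` and `knn_predict`, and all of the toy vectors and labels below are assumptions made for illustration only.

```python
import numpy as np
from collections import Counter


def gram_schmidt(vectors, tol=1e-10):
    """Orthonormalize the rows of `vectors` with Gram-Schmidt.

    Rows that are (numerically) linear combinations of earlier rows
    are dropped instead of being normalized.
    """
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        for b in basis:
            # Subtract the component of w that lies along an earlier axis.
            w -= np.dot(w, b) * b
        norm = np.linalg.norm(w)
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Hypothetical class axes in a 3-term space. They are strongly correlated:
# the angle between them is small.
class_axes = np.array([[1.0, 0.8, 0.0],
                       [0.8, 1.0, 0.2]])
basis = gram_schmidt(class_axes)

# Hypothetical labeled document vectors (term weights) and one query.
train_docs = np.array([[1.0, 0.6, 0.0],
                       [0.9, 0.5, 0.0],
                       [0.2, 1.0, 0.4],
                       [0.3, 0.9, 0.5]])
train_labels = ["A", "A", "B", "B"]
query_doc = np.array([0.95, 0.55, 0.05])

# Express every document in the orthonormal coordinate system, then classify.
train_coords = train_docs @ basis.T
query_coords = query_doc @ basis.T
predicted = knn_predict(train_coords, train_labels, query_coords, k=3)
print(predicted)
```

In this sketch the orthonormal axes replace the correlated class axes before any distances are computed, so the overlap between the two original axes no longer inflates the similarity between documents of different classes. How the paper selects the axes to correct, and which value of k it uses, is not stated in the abstract.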

Keywords

Search engines; Schmidt orthogonal; Vector space model; Text classification