Web of Proceedings - Francis Academic Press

Research on Network Public Opinion Text Representation Strategy for Subject Classification: Taking Sina Weibo as an Example

DOI: 10.25236/iciss.2019.058

Author(s)

Longjia Jia and Kun Hou

Corresponding Author

Longjia Jia

Abstract

In this paper, we propose a text representation strategy that addresses two problems in Sina Weibo topic classification research: unsuitable term weights and weak model interpretability. In the proposed document representation strategy, the term weighting vector is constructed by pre-selection prediction. On the training set, the effectiveness of each candidate term weighting vector is evaluated by cross-validation, and the vector with the best evaluation result is selected as the term weighting vector for the test set. Compared with the traditional W-Max, D-Max, and D-TMax methods, the proposed method improves MicroF1 by 4.25%, 5.03%, and 7.10%, respectively. In the classification of public opinion topics, the proposed method constructs a more explicit term weighting vector for the data set, which enhances the interpretability of the model and improves classification performance.
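
The selection step described in the abstract (score each candidate term weighting vector by cross-validation on the training set, then reuse the best one for the test set) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the two candidate schemes (`uniform_weights`, `idf_weights`) are generic stand-ins and not the paper's W-Max, D-Max, or D-TMax methods, the nearest-centroid classifier and the toy posts are hypothetical, and micro-F1 is computed as accuracy, which holds only for single-label classification.

```python
import math
from collections import Counter, defaultdict

# Hypothetical candidate schemes; each maps training docs to {term: weight}.
def uniform_weights(docs):
    return {t: 1.0 for d in docs for t in d}

def idf_weights(docs):
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

CANDIDATES = {"uniform": uniform_weights, "idf": idf_weights}

def vectorize(doc, weights):
    # Term frequency scaled by the term weight; unseen terms are dropped.
    tf = Counter(doc)
    return {t: tf[t] * weights[t] for t in tf if t in weights}

def cosine(a, b):
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def fit_centroids(vectors, labels):
    # Sum the vectors of each class; cosine ignores the missing 1/n factor.
    sums = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, v in vec.items():
            sums[y][t] += v
    return sums

def predict(centroids, vec):
    return max(sorted(centroids), key=lambda y: cosine(vec, centroids[y]))

def micro_f1(y_true, y_pred):
    # For single-label multi-class data, micro-F1 equals accuracy.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_val_score(docs, labels, weight_fn, k=3):
    true, pred = [], []
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        held_out = [i for i in range(len(docs)) if i % k == fold]
        train_docs = [docs[i] for i in train]
        weights = weight_fn(train_docs)  # weights see training folds only
        centroids = fit_centroids(
            [vectorize(d, weights) for d in train_docs],
            [labels[i] for i in train],
        )
        for i in held_out:
            true.append(labels[i])
            pred.append(predict(centroids, vectorize(docs[i], weights)))
    return micro_f1(true, pred)

def select_weighting(docs, labels, k=3):
    scores = {name: cross_val_score(docs, labels, fn, k)
              for name, fn in CANDIDATES.items()}
    best = max(sorted(scores), key=scores.get)
    return best, scores

# Toy, already-tokenized posts (invented for this sketch).
docs = [
    ["match", "goal", "team"], ["stock", "market", "bank"],
    ["team", "coach", "goal"], ["bank", "rate", "market"],
    ["goal", "league", "match"], ["market", "stock", "rate"],
]
labels = ["sports", "finance", "sports", "finance", "sports", "finance"]

best, scores = select_weighting(docs, labels, k=3)
final_weights = CANDIDATES[best](docs)  # refit on the full training set
print(best, scores)
```

In this sketch the winning scheme is refit on the whole training set, and `final_weights` would then be passed to `vectorize` for each test document, mirroring the abstract's statement that the best-scoring vector is carried over to the test set.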

Keywords

Internet public opinion security, Theme classification, Text representation strategy, Machine learning