Research on Automatic Splicing Technology of Chinese File Fragment based on MATLAB Rules
DOI: 10.25236/dpaic.2018.018
Corresponding Author
Liang He
Abstract
To splice regularly cut Chinese text fragments automatically, this paper studies the characteristics of Chinese character text in regularly cut fragment files, proposes a method for extracting text-line information from file fragments, and defines a fragment boundary degree based on the L1-norm. On this basis, it builds a 0-1 planning (0-1 integer programming) model of file fragment splicing and uses cluster analysis to reduce the algorithm's complexity. Compared with existing similar algorithms, the proposed algorithm completes the splicing correctly without manual intervention.
Keywords
Rule Fragmentation, 0-1 Planning, Cluster Analysis, Text Feature Extraction, L1-norm
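
The abstract does not give the exact definition of the fragment boundary degree or the 0-1 model, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each fragment is a vertical strip stored as a list of pixel rows, measures how badly two strips fit by the L1-norm of the difference between one strip's right edge and the other's left edge, and picks the left-to-right order with the smallest total mismatch. The names `l1_mismatch` and `best_order` are hypothetical, the exhaustive search is a stand-in for the paper's 0-1 planning model (it is only practical for a handful of fragments), and the text-line features and clustering step are omitted.

```python
from itertools import permutations

def left_edge(frag):
    """First pixel column of a fragment given as a list of rows."""
    return [row[0] for row in frag]

def right_edge(frag):
    """Last pixel column of a fragment given as a list of rows."""
    return [row[-1] for row in frag]

def l1_mismatch(a, b):
    """L1-norm of (right edge of a) - (left edge of b).

    Assumption: a small value suggests b sits directly to the right of a.
    """
    return sum(abs(x - y) for x, y in zip(right_edge(a), left_edge(b)))

def best_order(frags):
    """Return the index order (as a tuple) with minimal total edge mismatch.

    Exhaustive search over all orders; a stand-in for solving a
    0-1 programming model, feasible only for very small inputs.
    """
    n = len(frags)
    # Pairwise cost matrix: cost[i][j] = mismatch when j follows i.
    cost = [[l1_mismatch(frags[i], frags[j]) for j in range(n)]
            for i in range(n)]

    def total(order):
        return sum(cost[order[k]][order[k + 1]] for k in range(n - 1))

    return min(permutations(range(n)), key=total)
```

Calling `best_order(fragments)` on a small list of strips returns a tuple of indices into that list, read left to right. For realistic fragment counts the factorial search would have to be replaced by an integer programming or assignment solver, which is where grouping fragments by row through clustering would reduce the problem size.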