The Design and Implementation of Chinese-Uyghur Printed Document Retrieval System Based On OCR
DOI: 10.25236/icemit.2019.032
Author(s)
Eliyas Suleyman, Abdusalam Dawut, Palidan Tuerxun, Askar Hamdulla
Corresponding Author
Askar Hamdulla
Abstract
This paper presents the overall framework and functions of a Chinese-Uyghur printed document retrieval system based on Optical Character Recognition (OCR). The system retrieves Chinese and Uyghur documents from a keyword entered by the user. It localizes the keyword in a document image in three steps. First, the document image is segmented into basic units, namely words. Second, the entire content of the document image is recognized using OCR. Third, keyword localization is applied. Because the proposed system is built on OCR, the precision of keyword localization depends heavily on the accuracy of the OCR system used.
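The third step, keyword localization over already segmented and recognized units, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Unit` record, the `merge_boxes` helper, the `localize_keyword` function, and the pixel box layout are all hypothetical, and the sketch assumes the OCR stage has already produced a text string, a bounding box, and a line index for every unit.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Unit:
    """One segmented unit with its OCR result (assumed structure)."""
    text: str   # text recognized by the OCR stage
    box: Box    # position of the unit in the document image
    line: int   # index of the text line the unit belongs to


def merge_boxes(boxes: List[Box]) -> Box:
    """Return the smallest box that encloses all given boxes."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def localize_keyword(units: List[Unit], keyword: str) -> List[Box]:
    """Find every place where consecutive units on one line spell the keyword.

    A Uyghur keyword usually matches a single word unit, while a Chinese
    keyword may span several adjacent units, so units are concatenated
    until they match the keyword or stop being a prefix of it.
    """
    hits: List[Box] = []
    if not keyword:
        return hits
    for start in range(len(units)):
        joined = ""
        for end in range(start, len(units)):
            if units[end].line != units[start].line:
                break  # a match may not cross a line boundary
            joined += units[end].text
            if joined == keyword:
                hits.append(merge_boxes([u.box for u in units[start:end + 1]]))
                break
            if not keyword.startswith(joined):
                break  # this start position can no longer match
    return hits
```

Because the match is an exact string comparison on recognized text, a single misrecognized character in a unit causes that occurrence to be missed, which illustrates why localization precision follows the accuracy of the OCR stage.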
Keywords
Document Retrieval, Key Word Localization, Document Segmentation