Web of Proceedings - Francis Academic Press

The Relationship between Syntactic Complexity and Quality of NMT Outputs: An Exploratory Study

DOI: 10.25236/icbsis.2020.027

Author(s)

Huang Yueyue, Li Keru

Corresponding Author

Huang Yueyue, Li Keru

Abstract

Propelled by automated translation technology, translators today urgently need more precise guidelines on the ramifications of post-editing tasks. Working with the English-Chinese language pair, this paper explores the relationship between the syntactic complexity of the source text (ST) and the quality of Neural Machine Translation (NMT) output. Forty sentences were extracted from two legal documents. Three groups were formed based on sentence length: the first includes 20 sentences ranging from 7 to 36 words, the second includes another 20 longer sentences ranging from 31 to 68 words, and the third combines the two previous groups to test the overall correlation. The Syntactic Complexity Analyzer developed by Lu (2010) was used to measure the 40 sentences, which were then processed by two free online NMT systems: Google Neural Machine Translation (GNMT) and the Systran online translation tool. MT quality was evaluated manually by counting errors at the lexical and syntactic levels. The overall results suggest that ST syntactic complexity has a small-to-medium effect on NMT quality regardless of the NMT system, and that T-unit-related complexity measures, mean length of T-unit (MLT) in particular, account for most of this correlation. In addition, whereas GNMT output scores significantly higher than Systran output at the lexical level, error scores for the two systems do not differ significantly at the syntactic level.
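
As a rough illustration of the kind of analysis the abstract describes, the sketch below relates a per-sentence complexity measure (MLT, taken here as words divided by T-units) to a manual error count. The abstract does not say which correlation statistic was used, so Pearson's r is an assumption; the function names are hypothetical and the numbers are placeholders, not data from the study.

```python
from math import sqrt


def mean_length_of_t_unit(word_count, t_unit_count):
    """MLT for one sentence: number of words divided by number of T-units."""
    return word_count / t_unit_count


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only (NOT taken from the paper):
# each tuple is (word count, T-unit count) for one source sentence.
sentences = [(12, 1), (30, 2), (45, 2), (68, 3)]
# Hypothetical manually counted errors in the MT output of each sentence.
error_counts = [0, 1, 2, 2]

mlt = [mean_length_of_t_unit(w, t) for w, t in sentences]
r = pearson_r(mlt, error_counts)
print(f"r = {r:.2f}")
```

Under Cohen's commonly cited benchmarks, an absolute r near 0.1 is read as a small effect and one near 0.3 as a medium effect, which is one way to interpret a "small-to-medium" result.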

Keywords

Syntactic complexity, neural machine translation, quality evaluation