Automatic metrics for the evaluation of machine translation (MT) compute scores that globally characterize certain aspects of MT quality, such as adequacy and fluency. This paper introduces a reference-based metric that focuses on a particular class of function words, namely discourse connectives, which are especially important for text structuring and rather challenging for MT. To measure the accuracy of connective translation (ACT), the metric relies on automatic word-level alignment between a source sentence and, respectively, the reference and candidate translations, along with other heuristics for comparing translations of discourse connectives. Using a dictionary of equivalents, the translations are scored automatically or, for better precision, semi-automatically. The precision of the ACT metric is assessed by human judges on sample data for English/French and English/Arabic translations: the ACT scores are on average within 2% of human scores. The ACT metric is then applied to several commercial and research MT systems, providing an assessment of their performance on discourse connectives.
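The comparison step described above can be illustrated with a minimal sketch. It is not the ACT implementation: the dictionary entries, the four outcome labels, the function names (`aligned_connective`, `compare`, `accuracy`), and the alignment format (a map from source token index to target token indices) are all assumptions made for illustration, and the word alignments are supplied by hand rather than computed by an aligner.

```python
from typing import Dict, List, Optional

# Hypothetical dictionary of equivalents (an assumption, not the paper's):
# target-language connectives that share a group label count as equivalent.
EQUIVALENTS: Dict[str, str] = {
    "puisque": "causal",
    "car": "causal",
    "depuis": "temporal",
    "depuis que": "temporal",
}


def aligned_connective(
    tokens: List[str], alignment: Dict[int, List[int]], src_index: int
) -> Optional[str]:
    """Return the target words aligned to the source connective, or None."""
    positions = sorted(alignment.get(src_index, []))
    if not positions:
        return None
    return " ".join(tokens[i].lower() for i in positions)


def compare(ref_conn: Optional[str], cand_conn: Optional[str]) -> str:
    """Label one reference/candidate pair (labels are illustrative only)."""
    if ref_conn is None or cand_conn is None:
        return "missing"
    if ref_conn == cand_conn:
        return "same"
    ref_group = EQUIVALENTS.get(ref_conn)
    if ref_group is not None and ref_group == EQUIVALENTS.get(cand_conn):
        return "equivalent"
    return "different"


def accuracy(labels: List[str]) -> float:
    """Share of connectives whose translation is identical or equivalent."""
    if not labels:
        return 0.0
    accepted = sum(1 for label in labels if label in ("same", "equivalent"))
    return accepted / len(labels)


# Source sentence "Since it rained , we stayed": the connective is token 0.
reference = ["Puisque", "il", "pleuvait", ",", "nous", "sommes", "restés"]
candidate_a = ["Car", "il", "pleuvait", ",", "nous", "sommes", "restés"]
candidate_b = ["Depuis", "il", "pleuvait", ",", "nous", "sommes", "restés"]
hand_alignment = {0: [0]}  # assumed aligner output: source token 0 -> target token 0

ref_conn = aligned_connective(reference, hand_alignment, 0)
labels = [
    compare(ref_conn, aligned_connective(candidate_a, hand_alignment, 0)),
    compare(ref_conn, aligned_connective(candidate_b, hand_alignment, 0)),
]
print(labels, accuracy(labels))
```

In this sketch, pairs labeled `missing` or `different` are the natural candidates for manual inspection, which is one way to read the semi-automatic scoring mode mentioned above.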
Matthias Finger, Qian Wang, Yiming Li, Varun Sharma, Konstantin Androsov, Jan Steggemann, Xin Chen, Rakesh Chawla, Matteo Galli, Jian Wang, João Miguel das Neves Duarte, Tagir Aushev, Matthias Wolf, Yi Zhang, Tian Cheng, Yixing Chen, Werner Lustermann, Andromachi Tsirou, Alexis Kalogeropoulos, Andrea Rizzi, Ioannis Papadopoulos, Paolo Ronchese, Hua Zhang, Leonardo Cristella, Siyuan Wang, Tao Huang, David Vannerom, Michele Bianco, Sebastiana Gianì, Sun Hee Kim, Davide Di Croce, Kun Shi, Abhisek Datta, Jian Zhao, Federica Legger, Gabriele Grosso, Anna Mascellani, Ji Hyun Kim, Donghyun Kim, Zheng Wang, Sanjeev Kumar, Wei Li, Yong Yang, Ajay Kumar, Ashish Sharma, Georgios Anagnostou, Joao Varela, Csaba Hajdu, Muhammad Ahmad, Ekaterina Kuznetsova, Ioannis Evangelou, Milos Dordevic, Meng Xiao, Sourav Sen, Xiao Wang, Kai Yi, Jing Li, Rajat Gupta, Hui Wang, Seungkyu Ha, Pratyush Das, Anton Petrov, Xin Sun, Valérie Scheurer, Muhammad Ansar Iqbal, Lukas Layer