How Human is Machine Translationese? Comparing Human and Machine Translations of Text and Speech

Yuri Bizzoni, Tom S Juzek, Cristina España-Bonet, Koel Dutta Chowdhury, Josef van Genabith, Elke Teich


Abstract
Translationese is a phenomenon present in human translations, simultaneous interpreting, and even machine translations. Some translationese features tend to appear in simultaneous interpreting with higher frequency than in human text translation, but the reasons for this are unclear. This study analyzes translationese patterns in translation, interpreting, and machine translation outputs in order to explore possible reasons. In our analysis we (i) detail two non-invasive ways of detecting translationese and (ii) compare translationese across human and machine translations from text and speech. We find that machine translation shows traces of translationese, but does not reproduce the patterns found in human translation, offering support to the hypothesis that such patterns are due to the model (human vs. machine) rather than to the data (written vs. spoken).
Anthology ID:
2020.iwslt-1.34
Volume:
Proceedings of the 17th International Conference on Spoken Language Translation
Month:
July
Year:
2020
Address:
Online
Venues:
ACL | IWSLT | WS
Publisher:
Association for Computational Linguistics
Pages:
280–290
URL:
https://www.aclweb.org/anthology/2020.iwslt-1.34
DOI:
10.18653/v1/2020.iwslt-1.34
PDF:
http://aclanthology.lst.uni-saarland.de/2020.iwslt-1.34.pdf
Video:
http://slideslive.com/38929601