Dario Stojanovski


2020

pdf bib
ContraCAT: Contrastive Coreference Analytical Templates for Machine Translation
Dario Stojanovski | Benno Krojer | Denis Peskov | Alexander Fraser
Proceedings of the 28th International Conference on Computational Linguistics

Recent high scores on pronoun translation using context-aware neural machine translation have suggested that current approaches work well. ContraPro is a notable example of a contrastive challenge set for English→German pronoun translation. The high scores achieved by transformer models may suggest that they are able to effectively model the complicated set of inferences required to carry out pronoun translation. This entails the ability to determine which entities could be referred to, identify which entity a source-language pronoun refers to (if any), and access the target-language grammatical gender for that entity. We first show through a series of targeted adversarial attacks that in fact current approaches are not able to model all of this information well. Inserting small amounts of distracting information is enough to strongly reduce scores, which should not be the case. We then create a new template test set ContraCAT, designed to individually assess the ability to handle the specific steps necessary for successful pronoun translation. Our analyses show that current approaches to context-aware NMT rely on a set of surface heuristics, which break down when translations require real reasoning. We also propose an approach for augmenting the training data, with some improvements.

pdf bib
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
Alexandra Chronopoulou | Dario Stojanovski | Alexander Fraser
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Using a language model (LM) pretrained on two languages with large monolingual data in order to initialize an unsupervised neural machine translation (UNMT) system yields state-of-the-art results. When limited data is available for one language, however, this method leads to poor translations. We present an effective approach that reuses an LM that is pretrained only on the high-resource language. The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model. To reuse the pretrained LM, we have to modify its predefined vocabulary, to account for the new language. We therefore propose a novel vocabulary extension method. Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) in English-Macedonian (En-Mk) and English-Albanian (En-Sq), yielding more than +8.3 BLEU points for all four translation directions.

2019

pdf bib
The LMU Munich Unsupervised Machine Translation System for WMT19
Dario Stojanovski | Viktor Hangya | Matthias Huck | Alexander Fraser
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe LMU Munich’s machine translation system for German→Czech translation which was used to participate in the WMT19 shared task on unsupervised news translation. We train our model using monolingual data only from both languages. The final model is an unsupervised neural model using established techniques for unsupervised translation such as denoising autoencoding and online back-translation. We bootstrap the model with masked language model pretraining and enhance it with back-translations from an unsupervised phrase-based system which is itself bootstrapped using unsupervised bilingual word embeddings.

pdf bib
Combining Local and Document-Level Context: The LMU Munich Neural Machine Translation System at WMT19
Dario Stojanovski | Alexander Fraser
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe LMU Munich’s machine translation system for English→German translation which was used to participate in the WMT19 shared task on supervised news translation. We specifically participated in the document-level MT track. The system used as a primary submission is a context-aware Transformer capable of both rich modeling of limited contextual information and integration of large-scale document-level context with a less rich representation. We train this model by fine-tuning a big Transformer baseline. Our experimental results show that document-level context provides for large improvements in translation quality, and adding a rich representation of the previous sentence provides a small additional gain.

pdf bib
Improving Anaphora Resolution in Neural Machine Translation Using Curriculum Learning
Dario Stojanovski | Alexander Fraser
Proceedings of Machine Translation Summit XVII Volume 1: Research Track

2018

pdf bib
Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments
Dario Stojanovski | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Research Papers

Cross-sentence context can provide valuable information in Machine Translation and is critical for translation of anaphoric pronouns and for providing consistent translations. In this paper, we devise simple oracle experiments targeting coreference and coherence. Oracles are an easy way to evaluate the effect of different discourse-level phenomena in NMT using BLEU and eliminate the necessity to manually define challenge sets for this purpose. We propose two context-aware NMT models and compare them against models working on a concatenation of consecutive sentences. Concatenation models perform better, but are computationally expensive. We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation. Access to strong signals allows us to make clear comparisons between context-aware models.

pdf bib
The LMU Munich Unsupervised Machine Translation Systems
Dario Stojanovski | Viktor Hangya | Matthias Huck | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We describe LMU Munich’s unsupervised machine translation systems for English↔German translation. These systems were used to participate in the WMT18 news translation shared task and more specifically, for the unsupervised learning sub-track. The systems are trained on English and German monolingual data only and exploit and combine previously proposed techniques such as using word-by-word translated data based on bilingual word embeddings, denoising and on-the-fly backtranslation.

pdf bib
LMU Munich’s Neural Machine Translation Systems at WMT 2018
Matthias Huck | Dario Stojanovski | Viktor Hangya | Alexander Fraser
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We present the LMU Munich machine translation systems for the English–German language pair. We have built neural machine translation systems for both translation directions (English→German and German→English) and for two different domains (the biomedical domain and the news domain). The systems were used for our participation in the WMT18 biomedical translation task and in the shared task on machine translation of news. The main focus of our recent system development efforts has been on achieving improvements in the biomedical domain over last year’s strong biomedical translation engine for English→German (Huck et al., 2017a). Considerable progress has been made in the latter task, which we report on in this paper.

2016

pdf bib
Finki at SemEval-2016 Task 4: Deep Learning Architecture for Twitter Sentiment Analysis
Dario Stojanovski | Gjorgji Strezoski | Gjorgji Madjarov | Ivica Dimitrovski
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)