Mathias Creutz


2020

Paraphrase Generation and Evaluation on Colloquial-Style Sentences
Eetu Sjöblom | Mathias Creutz | Yves Scherrer
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, we investigate paraphrase generation in the colloquial domain. We use state-of-the-art neural machine translation models trained on the Opusparcus corpus to generate paraphrases in six languages: German, English, Finnish, French, Russian, and Swedish. We perform experiments to understand how data selection and filtering for diverse paraphrase pairs affect the generated paraphrases. We compare two model architectures, an RNN and a Transformer, and find that the Transformer does not generally outperform the RNN. We also conduct human evaluation on five of the six languages and compare the results to the automatic evaluation metrics BLEU and the recently proposed BERTScore. The results advance our understanding of the trade-off between the quality and the novelty of the generated paraphrases, both of which are affected by the data selection method. In addition, our comparison of the evaluation methods shows that while BLEU correlates well with human judgments at the corpus level, BERTScore outperforms BLEU in both corpus-level and sentence-level evaluation.
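
For readers who want to reproduce this kind of automatic evaluation, a minimal sketch contrasting corpus-level BLEU (via the sacrebleu package) and BERTScore (via the bert-score package) could look as follows; the candidate and reference sentences are illustrative placeholders, not Opusparcus data or the paper's actual evaluation pipeline.

```python
# Sketch: scoring generated paraphrases against references with BLEU
# (sacrebleu) and BERTScore (bert-score). Sentences are placeholders.
import sacrebleu
from bert_score import score

candidates = ["how are you doing", "i have to go now"]
references = ["how is it going", "i must leave now"]

# Corpus-level BLEU; sacrebleu expects a list of reference streams.
bleu = sacrebleu.corpus_bleu(candidates, [references])
print(f"BLEU: {bleu.score:.2f}")

# BERTScore returns per-sentence precision/recall/F1 tensors, which is
# why it supports sentence-level as well as corpus-level comparison.
P, R, F1 = score(candidates, references, lang="en")
print(f"BERTScore F1 (corpus mean): {F1.mean().item():.4f}")
for cand, f1 in zip(candidates, F1):
    print(f"  {f1.item():.4f}  {cand}")
```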

A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation
Raúl Vázquez | Alessandro Raganato | Mathias Creutz | Jörg Tiedemann
Computational Linguistics, Volume 46, Issue 2 - June 2020

Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate cross-lingual shared layer, which we refer to as the attention bridge. This layer exploits the semantics of each language and develops into a language-agnostic meaning representation that can be used efficiently for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also increase the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression, which is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance on non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
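
As a rough illustration of what such an attention bridge computes, the following PyTorch sketch pools a variable-length sequence of encoder states into a fixed number of attention-head vectors. The two-layer scoring network and all dimension choices are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class AttentionBridge(nn.Module):
    """Inner-attention layer pooling variable-length encoder states into
    a fixed number of attention-head vectors (a sketch of the idea in
    the abstract, not the authors' exact code)."""

    def __init__(self, hidden_dim: int, attn_dim: int, n_heads: int):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, attn_dim, bias=False)
        self.w2 = nn.Linear(attn_dim, n_heads, bias=False)

    def forward(self, enc_states, mask=None):
        # enc_states: (batch, seq_len, hidden_dim)
        scores = self.w2(torch.tanh(self.w1(enc_states)))  # (batch, seq_len, n_heads)
        if mask is not None:  # ignore padding positions
            scores = scores.masked_fill(~mask.unsqueeze(-1), float("-inf"))
        attn = torch.softmax(scores, dim=1)  # normalize over sequence positions
        # Fixed-size sentence representation: (batch, n_heads, hidden_dim)
        return attn.transpose(1, 2) @ enc_states

bridge = AttentionBridge(hidden_dim=512, attn_dim=256, n_heads=10)
sent = bridge(torch.randn(2, 37, 512))  # -> torch.Size([2, 10, 512])
```

The number of heads fixes the size of the sentence representation regardless of sentence length, which is the quantity whose trade-offs (translation quality versus compression) the article studies.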

2019

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation
Alessandro Raganato | Raúl Vázquez | Mathias Creutz | Jörg Tiedemann
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that translation performance does correlate with performance on trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also increase the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression, which is beneficial in non-trainable similarity tasks. We hypothesize that training on the downstream task enables the model to identify the encoded information that is useful for that specific task, whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.
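
The contrast between trainable and non-trainable evaluation can be made concrete with a small sketch: a logistic-regression probe trained on fixed-size sentence vectors versus a cosine-similarity comparison of the same vectors. The random vectors below are placeholders standing in for representations extracted from the translation model, not real data.

```python
# Sketch of the two downstream evaluation regimes contrasted in the
# abstract, applied to fixed-size sentence vectors (random NumPy
# placeholders stand in for embeddings from the translation model).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 512)), rng.integers(0, 2, 200)
X_test, y_test = rng.normal(size=(50, 512)), rng.integers(0, 2, 50)

# Trainable probe: the classifier can learn which dimensions matter
# for the task and ignore the rest.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))

# Non-trainable benchmark: cosine similarity weights all dimensions
# equally, so task-irrelevant information can dilute the signal.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("similarity:", cosine(X_test[0], X_test[1]))
```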

Multilingual NMT with a Language-Independent Attention Bridge
Raúl Vázquez | Alessandro Raganato | Jörg Tiedemann | Mathias Creutz
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

In this paper, we propose an architecture for machine translation (MT) capable of obtaining multilingual sentence representations by incorporating an intermediate attention bridge that is shared across all languages. We train the model with language-specific encoders and decoders that are connected through an inner-attention layer on the encoder side. The attention bridge exploits the semantics of each language for translation and develops into a language-agnostic meaning representation that can be used efficiently for transfer learning. We present a new framework for the efficient development of multilingual neural machine translation (NMT) using this model and scheduled training. We test the approach systematically on a multi-parallel data set. The model achieves substantial improvements over strong bilingual models and performs well in zero-shot translation, demonstrating its capacity for abstraction and transfer learning.
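
The overall wiring can be sketched as follows: one encoder and one decoder per language, all routed through a single shared bridge. The components below are toy stand-ins (plain linear layers) purely to show the routing that makes zero-shot translation possible, not the authors' actual encoder, decoder, or bridge implementations.

```python
import torch
import torch.nn as nn

class MultilingualNMT(nn.Module):
    """Sketch of the wiring described in the abstract: one encoder and
    one decoder per language, all connected through a single shared
    attention bridge (toy linear layers stand in for each component)."""

    def __init__(self, languages, dim=512):
        super().__init__()
        self.encoders = nn.ModuleDict({l: nn.Linear(dim, dim) for l in languages})
        self.decoders = nn.ModuleDict({l: nn.Linear(dim, dim) for l in languages})
        self.bridge = nn.Linear(dim, dim)  # shared across ALL language pairs

    def forward(self, src, src_lang, tgt_lang):
        states = self.encoders[src_lang](src)  # language-specific encoder
        shared = self.bridge(states)           # shared, language-agnostic layer
        return self.decoders[tgt_lang](shared) # language-specific decoder

model = MultilingualNMT(["de", "en", "fi", "fr"])
out = model(torch.randn(2, 512), src_lang="fi", tgt_lang="en")
# Zero-shot: any encoder/decoder pairing works, even pairs never seen
# together in training, because all of them share the same bridge.
```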

Toward automatic improvement of language produced by non-native language learners
Mathias Creutz | Eetu Sjöblom
Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning

2018

Open Subtitles Paraphrase Corpus for Six Languages
Mathias Creutz
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Paraphrase Detection on Noisy Subtitles in Six Languages
Eetu Sjöblom | Mathias Creutz | Mikko Aulamo
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

We perform automatic paraphrase detection on subtitle data from the Opusparcus corpus, comprising six European languages: German, English, Finnish, French, Russian, and Swedish. We train two types of supervised sentence embedding models: a word-averaging (WA) model and a gated recurrent averaging network (GRAN) model. We find that GRAN outperforms WA and is more robust to noisy training data. Better results are obtained with more, noisier data than with less, cleaner data. Additionally, we experiment on other datasets, but do not reach the same level of performance there, owing to a domain mismatch between training and test data.
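
As a point of reference for the simpler of the two models, a word-averaging sentence encoder can be sketched in a few lines of PyTorch; the vocabulary size and embedding dimension below are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class WordAveraging(nn.Module):
    """The WA baseline from the abstract: a sentence embedding is the
    mean of its word embeddings (dimensions are assumptions)."""

    def __init__(self, vocab_size=50000, dim=300, pad_idx=0):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim, padding_idx=pad_idx)
        self.pad_idx = pad_idx

    def forward(self, token_ids):
        # token_ids: (batch, seq_len); average only over non-pad tokens
        mask = (token_ids != self.pad_idx).unsqueeze(-1).float()
        vecs = self.emb(token_ids) * mask
        return vecs.sum(dim=1) / mask.sum(dim=1).clamp(min=1)

model = WordAveraging()
ids = torch.tensor([[5, 9, 2, 0, 0]])  # one padded sentence
emb = model(ids)                       # -> torch.Size([1, 300])
```

Paraphrase detection then reduces to thresholding the similarity of two such sentence vectors; the GRAN model replaces the plain average with a gated recurrent averaging network.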

2009

Web Augmentation of Language Models for Continuous Speech Recognition of SMS Text Messages
Mathias Creutz | Sami Virpioja | Anna Kovaleva
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

Speech to speech machine translation: Biblical chatter from Finnish to English
David Ellis | Mathias Creutz | Timo Honkela | Mikko Kurimo
Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages

2007

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages
Mathias Creutz | Teemu Hirsimäki | Mikko Kurimo | Antti Puurula | Janne Pylkkönen | Vesa Siivola | Matti Varjokallio | Ebru Arisoy | Murat Saraçlar | Andreas Stolcke
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2004

Induction of a Simple Morphology for Highly-Inflecting Languages
Mathias Creutz | Krista Lagus
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology

2003

Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency
Mathias Creutz
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

Unsupervised Discovery of Morphemes
Mathias Creutz | Krista Lagus
Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning