Laura Jehl


Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss
Laura Jehl | Carolin Lawrence | Stefan Riezler
Transactions of the Association for Computational Linguistics, Volume 7

In many machine learning scenarios, supervision by gold labels is not available and consequently neural models cannot be trained directly by maximum likelihood estimation. In a weak supervision scenario, metric-augmented objectives can be employed to assign feedback to model outputs, which can be used to extract a supervision signal for training. We present several objectives for two separate weakly supervised tasks, machine translation and semantic parsing. We show that objectives should actively discourage negative outputs in addition to promoting a surrogate gold structure. This notion of bipolarity is naturally present in ramp loss objectives, which we adapt to neural models. We show that bipolar ramp loss objectives outperform other non-bipolar ramp loss objectives and minimum risk training on both weakly supervised tasks, as well as on a supervised machine translation task. Additionally, we introduce a novel token-level ramp loss objective, which is able to outperform even the best sequence-level ramp loss on both weakly supervised tasks.


Document-Level Information as Side Constraints for Improved Neural Patent Translation
Laura Jehl | Stefan Riezler
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)


Learning to translate from graded and negative relevance information
Laura Jehl | Stefan Riezler
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present an approach for learning to translate by exploiting cross-lingual link structure in multilingual document collections. We propose a new learning objective based on structured ramp loss, which learns from graded relevance, explicitly including negative relevance information. Our results on English-German translation of Wikipedia entries show small, but significant, improvements of our method over an unadapted baseline, even when only a weak relevance signal is used. We also compare our method to monolingual language model adaptation and automatic pseudo-parallel data extraction and find small improvements even over these strong baselines.


Source-side Preordering for Translation using Logistic Regression and Depth-first Branch-and-Bound Search
Laura Jehl | Adrià de Gispert | Mark Hopkins | Bill Byrne
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics


Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings
Artem Sokolov | Laura Jehl | Felix Hieber | Stefan Riezler
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Task Alternation in Parallel Sentence Retrieval for Twitter Translation
Felix Hieber | Laura Jehl | Stefan Riezler
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Twitter Translation using Translation-Based Cross-Lingual Retrieval
Laura Jehl | Felix Hieber | Stefan Riezler
Proceedings of the Seventh Workshop on Statistical Machine Translation