Christian Federmann


2020

pdf bib
Assessing Human-Parity in Machine Translation on the Segment Level
Yvette Graham | Christian Federmann | Maria Eskevich | Barry Haddow
Findings of the Association for Computational Linguistics: EMNLP 2020

Recent machine translation shared tasks have shown top-performing systems to tie or in some cases even outperform human translation. Such conclusions about system and human performance are, however, based on estimates aggregated from scores collected over large test sets of translations and unfortunately leave some remaining questions unanswered. For instance, simply because a system significantly outperforms the human translator on average may not necessarily mean that it has done so for every translation in the test set. Firstly, are there remaining source segments present in evaluation test sets that cause significant challenges for top-performing systems and can such challenging segments go unnoticed due to the opacity of current human evaluation procedures? To provide insight into these issues we carefully inspect the outputs of top-performing systems in the most recent WMT-19 news translation shared task for all language pairs in which a system either tied or outperformed human translation. Our analysis provides a new method of identifying the remaining segments for which either machine or human perform poorly. For example, in our close inspection of WMT-19 English to German and German to English we discover the segments that disjointly proved a challenge for human and machine. For English to Russian, there were no segments included in our sample of translations that caused a significant challenge for the human translator, while we again identify the set of segments that caused issues for the top-performing system.

pdf bib
Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
Michael Denkowski | Christian Federmann
Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

pdf bib
FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN
Ebrahim Ansari | Amittai Axelrod | Nguyen Bach | Ondřej Bojar | Roldano Cattoni | Fahim Dalvi | Nadir Durrani | Marcello Federico | Christian Federmann | Jiatao Gu | Fei Huang | Kevin Knight | Xutai Ma | Ajay Nagesh | Matteo Negri | Jan Niehues | Juan Pino | Elizabeth Salesky | Xing Shi | Sebastian Stüker | Marco Turchi | Alexander Waibel | Changhan Wang
Proceedings of the 17th International Conference on Spoken Language Translation

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation. A total of teams participated in at least one of the tracks. This paper introduces each track’s goal, data and evaluation metrics, and reports the results of the received submissions.

pdf bib
TICO-19: the Translation Initiative for COvid-19
Antonios Anastasopoulos | Alessandro Cattelan | Zi-Yi Dou | Marcello Federico | Christian Federmann | Dmitriy Genzel | Franscisco Guzmán | Junjie Hu | Macduff Hughes | Philipp Koehn | Rosie Lazar | Will Lewis | Graham Neubig | Mengmeng Niu | Alp Öktem | Eric Paquin | Grace Tang | Sylwia Tur
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

The COVID-19 pandemic is the worst pandemic to strike the world in over a century. Crucial to stemming the tide of the SARS-CoV-2 virus is communicating to vulnerable populations the means by which they can protect themselves. To this end, the collaborators forming the Translation Initiative for COvid-19 (TICO-19) have made test and development data available to AI and MT researchers in 35 different languages in order to foster the development of tools and resources for improving access to information about COVID-19 in these languages. In addition to 9 high-resourced, ”pivot” languages, the team is targeting 26 lesser resourced languages, in particular languages of Africa, South Asia and South-East Asia, whose populations may be the most vulnerable to the spread of the virus. The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set. Further, the team is converting the test and development data into translation memories (TMXs) that can be used by localizers from and to any of the languages.

2019

pdf bib
Multilingual Whispers: Generating Paraphrases with Translation
Christian Federmann | Oussama Elachqar | Chris Quirk
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Naturally occurring paraphrase data, such as multiple news stories about the same event, is a useful but rare resource. This paper compares translation-based paraphrase gathering using human, automatic, or hybrid techniques to monolingual paraphrasing by experts and non-experts. We gather translations, paraphrases, and empirical human quality assessments of these approaches. Neural machine translation techniques, especially when pivoting through related languages, provide a relatively robust source of paraphrases with diversity comparable to expert human paraphrases. Surprisingly, human translators do not reliably outperform neural systems. The resulting data release will not only be a useful test set, but will also allow additional explorations in translation and paraphrase quality assessments and relationships.

pdf bib
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)

pdf bib
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

pdf bib
Findings of the 2019 Conference on Machine Translation (WMT19)
Loïc Barrault | Ondřej Bojar | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Philipp Koehn | Shervin Malmasi | Christof Monz | Mathias Müller | Santanu Pal | Matt Post | Marcos Zampieri
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019. Participants were asked to build machine translation systems for any of 18 language pairs, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. The task was also opened up to additional test suites to probe specific aspects of translation.

pdf bib
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Marco Turchi | Karin Verspoor
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

pdf bib
Findings of the WMT 2019 Shared Tasks on Quality Estimation
Erick Fonseca | Lisa Yankovskaya | André F. T. Martins | Mark Fishel | Christian Federmann
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

We report the results of the WMT19 shared task on Quality Estimation, i.e. the task of predicting the quality of the output of machine translation systems given just the source text and the hypothesis translations. The task includes estimation at three granularity levels: word, sentence and document. A novel addition is evaluating sentence-level QE against human judgments: in other words, designing MT metrics that do not need a reference translation. This year we include three language pairs, produced solely by neural machine translation systems. Participating teams from eleven institutions submitted a variety of systems to different task variants and language pairs.

pdf bib
Findings of the WMT 2019 Shared Task on Automatic Post-Editing
Rajen Chatterjee | Christian Federmann | Matteo Negri | Marco Turchi
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

We present the results from the 5th round of the WMT task on MT Automatic Post-Editing. The task consists in automatically correcting the output of a “black-box” machine translation system by learning from human corrections. Keeping the same general evaluation setting of the previous four rounds, this year we focused on two language pairs (English-German and English-Russian) and on domain-specific data (In-formation Technology). For both the language directions, MT outputs were produced by neural systems unknown to par-ticipants. Seven teams participated in the English-German task, with a total of 18 submitted runs. The evaluation, which was performed on the same test set used for the 2018 round, shows a slight progress in APE technology: 4 teams achieved better results than last year’s winning system, with improvements up to -0.78 TER and +1.23 BLEU points over the baseline. Two teams participated in theEnglish-Russian task submitting 2 runs each. On this new language direction, characterized by a higher quality of the original translations, the task proved to be particularly challenging. None of the submitted runs improved the very high results of the strong system used to produce the initial translations(16.16 TER, 76.20 BLEU).

2018

pdf bib
Appraise Evaluation Framework for Machine Translation
Christian Federmann
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

We present Appraise, an open-source framework for crowd-based annotation tasks, notably for evaluation of machine translation output. This is the software used to run the yearly evaluation campaigns for shared tasks at the WMT Conference on Machine Translation. It has also been used at IWSLT 2017 and, recently, to measure human parity for machine translation for Chinese to English news text. The demo will present the full end-to-end lifecycle of an Appraise evaluation campaign, from task creation to annotation and interpretation of results.

pdf bib
Proceedings of the Third Conference on Machine Translation: Research Papers
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Lucia Specia | Marco Turchi | Karin Verspoor
Proceedings of the Third Conference on Machine Translation: Research Papers

bib
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Matt Post | Lucia Specia | Marco Turchi | Karin Verspoor
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

pdf bib
Findings of the 2018 Conference on Machine Translation (WMT18)
Ondřej Bojar | Christian Federmann | Mark Fishel | Yvette Graham | Barry Haddow | Philipp Koehn | Christof Monz
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018. Participants were asked to build machine translation systems for any of 7 language pairs in both directions, to be evaluated on a test set of news stories. The main metric for this task is human judgment of translation quality. This year, we also opened up the task to additional test sets to probe specific aspects of translation.

2017

pdf bib
Proceedings of the Second Conference on Machine Translation
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Julia Kreutzer
Proceedings of the Second Conference on Machine Translation

pdf bib
Findings of the 2017 Conference on Machine Translation (WMT17)
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Shujian Huang | Matthias Huck | Philipp Koehn | Qun Liu | Varvara Logacheva | Christof Monz | Matteo Negri | Matt Post | Raphael Rubino | Lucia Specia | Marco Turchi
Proceedings of the Second Conference on Machine Translation

2016

pdf bib
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Liane Guillou | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Aurélie Névéol | Mariana Neves | Pavel Pecina | Martin Popel | Philipp Koehn | Christof Monz | Matteo Negri | Matt Post | Lucia Specia | Karin Verspoor | Jörg Tiedemann | Marco Turchi
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

bib
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Ondřej Bojar | Christian Buck | Rajen Chatterjee | Christian Federmann | Liane Guillou | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Aurélie Névéol | Mariana Neves | Pavel Pecina | Martin Popel | Philipp Koehn | Christof Monz | Matteo Negri | Matt Post | Lucia Specia | Karin Verspoor | Jörg Tiedemann | Marco Turchi
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
Findings of the 2016 Conference on Machine Translation
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Yvette Graham | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Varvara Logacheva | Christof Monz | Matteo Negri | Aurélie Névéol | Mariana Neves | Martin Popel | Matt Post | Raphael Rubino | Carolina Scarton | Lucia Specia | Marco Turchi | Karin Verspoor | Marcos Zampieri
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2015

pdf bib
Proceedings of the Tenth Workshop on Statistical Machine Translation
Ondřej Bojar | Rajan Chatterjee | Christian Federmann | Barry Haddow | Chris Hokamp | Matthias Huck | Varvara Logacheva | Pavel Pecina
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar | Rajen Chatterjee | Christian Federmann | Barry Haddow | Matthias Huck | Chris Hokamp | Philipp Koehn | Varvara Logacheva | Christof Monz | Matteo Negri | Matt Post | Carolina Scarton | Lucia Specia | Marco Turchi
Proceedings of the Tenth Workshop on Statistical Machine Translation

2014

pdf bib
Proceedings of the Ninth Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Christian Federmann | Barry Haddow | Philipp Koehn | Christof Monz | Matt Post | Lucia Specia
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
Findings of the 2014 Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Christian Federmann | Barry Haddow | Philipp Koehn | Johannes Leveling | Christof Monz | Pavel Pecina | Matt Post | Herve Saint-Amand | Radu Soricut | Lucia Specia | Aleš Tamchyna
Proceedings of the Ninth Workshop on Statistical Machine Translation

2013

pdf bib
Findings of the 2013 Workshop on Statistical Machine Translation
Ondřej Bojar | Christian Buck | Chris Callison-Burch | Christian Federmann | Barry Haddow | Philipp Koehn | Christof Monz | Matt Post | Radu Soricut | Lucia Specia
Proceedings of the Eighth Workshop on Statistical Machine Translation

2012

pdf bib
Involving Language Professionals in the Evaluation of Machine Translation
Eleftherios Avramidis | Aljoscha Burchardt | Christian Federmann | Maja Popović | Cindy Tscherwinka | David Vilar
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Significant breakthroughs in machine translation only seem possible if human translators are taken into the loop. While automatic evaluation and scoring mechanisms such as BLEU have enabled the fast development of systems, it is not clear how systems can meet real-world (quality) requirements in industrial translation scenarios today. The taraXÜ project paves the way for wide usage of hybrid machine translation outputs through various feedback loops in system development. In a consortium of research and industry partners, the project integrates human translators into the development process for rating and post-editing of machine translation outputs thus collecting feedback for possible improvements.

pdf bib
A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation
Eleftherios Avramidis | Marta R. Costa-jussà | Christian Federmann | Josef van Genabith | Maite Melero | Pavel Pecina
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In recent years, machine translation (MT) research has focused on investigating how hybrid machine translation as well as system combination approaches can be designed so that the resulting hybrid translations show an improvement over the individual “component” translations. As a first step towards achieving this objective we have developed a parallel corpus with source text and the corresponding translation output from a number of machine translation engines, annotated with metadata information, capturing aspects of the translation process performed by the different MT systems. This corpus aims to serve as a basic resource for further research on whether hybrid machine translation algorithms and system combination techniques can benefit from additional (linguistically motivated, decoding, and runtime) information provided by the different systems involved. In this paper, we describe the annotated corpus we have created. We provide an overview on the component MT systems and the XLIFF-based annotation format we have developed. We also report on first experiments with the ML4HMT corpus data.

pdf bib
META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools
Christian Federmann | Ioanna Giannopoulou | Christian Girardi | Olivier Hamon | Dimitris Mavroeidis | Salvatore Minutoli | Marc Schröder
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe META-SHARE which aims at providing an open, distributed, secure, and interoperable infrastructure for the exchange of language resources, including both data and tools. The application has been designed and is developed as part of the T4ME Network of Excellence. We explain the underlying motivation for such a distributed repository for metadata storage and give a detailed overview on the META-SHARE application and its various components. This includes a discussion of the technical architecture of the system as well as a description of the component-based metadata schema format which has been developed in parallel. Development of the META-SHARE infrastructure adopts state-of-the-art technology and follows an open-source approach, allowing the general community to participate in the development process. The META-SHARE software package including full source code has been released to the public in March 2012. We look forward to present an up-to-date version of the META-SHARE software at the conference.

pdf bib
The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation
Christian Federmann | Eleftherios Avramidis | Marta R. Costa-jussà | Josef van Genabith | Maite Melero | Pavel Pecina
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe the “Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation” (ML4HMT) which aims to foster research on improved system combination approaches for machine translation (MT). Participants of the challenge are requested to build hybrid translations by combining the output of several MT systems of different types. We first describe the ML4HMT corpus used in the shared task, then explain the XLIFF-based annotation format we have designed for it, and briefly summarize the participating systems. Using both automated metrics scores and extensive manual evaluation, we discuss the individual performance of the various systems. An interesting result from the shared task is the fact that we were able to observe different systems winning according to the automated metrics scores when compared to the results from the manual evaluation. We conclude by summarising the first edition of the challenge and by giving an outlook to future work.

pdf bib
Can Machine Learning Algorithms Improve Phrase Selection in Hybrid Machine Translation?
Christian Federmann
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

pdf bib
Machine Learning for Hybrid Machine Translation
Sabine Hunsicker | Yu Chen | Christian Federmann
Proceedings of the Seventh Workshop on Statistical Machine Translation

pdf bib
Using Domain-specific and Collaborative Resources for Term Translation
Mihael Arcan | Christian Federmann | Paul Buitelaar
Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT
Josef van Genabith | Toni Badia | Christian Federmann | Maite Melero | Marta R. Costa-jussà | Tsuyoshi Okita
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT

pdf bib
System Combination Using Joint, Binarised Feature Vectors
Christian Federmann
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT

pdf bib
Results from the ML4HMT-12 Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation
Christian Federmann | Tsuyoshi Okita | Maite Melero | Marta R. Costa-Jussa | Toni Badia | Josef van Genabith
Proceedings of the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT

pdf bib
Experiments with Term Translation
Mihael Arcan | Christian Federmann | Paul Buitelaar
Proceedings of COLING 2012

2011

pdf bib
Stochastic Parse Tree Selection for an Existing RBMT System
Christian Federmann | Sabine Hunsicker
Proceedings of the Sixth Workshop on Statistical Machine Translation

pdf bib
From Statistical Term Extraction to Hybrid Machine Translation
Petra Wolf | Ulrike Bernardi | Christian Federmann | Sabine Hunsicker
Proceedings of the 15th Annual conference of the European Association for Machine Translation

2010

pdf bib
Further Experiments with Shallow Hybrid MT Systems
Christian Federmann | Andreas Eisele | Yu Chen | Sabine Hunsicker | Jia Xu | Hans Uszkoreit
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

pdf bib
Appraise: An Open-Source Toolkit for Manual Phrase-Based Evaluation of Translations
Christian Federmann
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We describe a focused effort to investigate the performance of phrase-based, human evaluation of machine translation output achieving a high annotator agreement. We define phrase-based evaluation and describe the implementation of Appraise, a toolkit that supports the manual evaluation of machine translation results. Phrase ranking can be done using either a fine-grained six-way scoring scheme that allows to differentiate between ""much better"" and ""slightly better"", or a reduced subset of ranking choices. Afterwards we discuss kappa values for both scoring models from several experiments conducted with human annotators. Our results show that phrase-based evaluation can be used for fast evaluation obtaining significant agreement among annotators. The granularity of ranking choices should, however, not be too fine-grained as this seems to confuse annotators and thus reduces the overall agreement. The work reported in this paper confirms previous work in the field and illustrates that the usage of human evaluation in machine translation should be reconsidered. The Appraise toolkit is available as open-source and can be downloaded from the author's website.

pdf bib
Extraction, Merging, and Monitoring of Company Data from Heterogeneous Sources
Christian Federmann | Thierry Declerck
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We describe the implementation of an enterprise monitoring system that builds on an ontology-based information extraction (OBIE) component applied to heterogeneous data sources. The OBIE component consists of several IE modules - each extracting on a regular temporal basis a specific fraction of company data from a given data source - and a merging tool, which is used to aggregate all the extracted information about a company. The full set of information about companies, which is to be extracted and merged by the OBIE component, is given in the schema of a domain ontology, which is guiding the information extraction process. The monitoring system, in case it detects changes in the extracted and merged information on a company with respect to the actual state of the knowledge base of the underlying ontology, ensures the update of the population of the ontology. As we are using an ontology extended with temporal information, the system is able to assign time intervals to any of the object instances. Additionally, detected changes can be communicated to end-users, who can validate and possibly correct the resulting updates in the knowledge base.

2009

pdf bib
Combining Multi-Engine Translations with Moses
Yu Chen | Michael Jellinghaus | Andreas Eisele | Yi Zhang | Sabine Hunsicker | Silke Theison | Christian Federmann | Hans Uszkoreit
Proceedings of the Fourth Workshop on Statistical Machine Translation

pdf bib
Translation Combination using Factored Word Substitution
Christian Federmann | Silke Theison | Andreas Eisele | Hans Uszkoreit | Yu Chen | Michael Jellinghaus | Sabine Hunsicker
Proceedings of the Fourth Workshop on Statistical Machine Translation

2008

pdf bib
Extracting and Querying Relations in Scientific Papers on Language Technology
Ulrich Schäfer | Hans Uszkoreit | Christian Federmann | Torsten Marek | Yajing Zhang
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting “quriples” are stored in a database from where they can be retrieved (again using abstraction methods) by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist’s Workbench. It supports researchers in editing and online-searching scientific papers.

pdf bib
Using Moses to Integrate Multiple Rule-Based Machine Translation Engines into a Hybrid System
Andreas Eisele | Christian Federmann | Hervé Saint-Amand | Michael Jellinghaus | Teresa Herrmann | Yu Chen
Proceedings of the Third Workshop on Statistical Machine Translation

pdf bib
Hybrid machine translation architectures within and beyond the EuroMatrix project
Andreas Eisele | Christian Federmann | Hans Uszkoreit | Hervé Saint-Amand | Martin Kay | Michael Jellinghaus | Sabine Hunsicker | Teresa Herrmann | Yu Chen
Proceedings of the 12th Annual conference of the European Association for Machine Translation

2007

pdf bib
Multi-Engine Machine Translation with an Open-Source SMT Decoder
Yu Chen | Andreas Eisele | Christian Federmann | Eva Hasler | Michael Jellinghaus | Silke Theison
Proceedings of the Second Workshop on Statistical Machine Translation

Search
Co-authors