Gregor Thurmair


2015

Evaluation of the domain adaptation of MT systems in ACCURAT
Gregor Thurmair
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

2014

Conceptual transfer: Using local classifiers for transfer selection
Gregor Thurmair
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

A key challenge for Machine Translation is transfer selection, i.e. finding the right translation for a given word from a set of alternatives (1:n). The problem becomes more important as the dictionary grows, because the number of alternatives increases. This contribution presents a novel approach to transfer selection, called conceptual transfer, in which selection is done by classifiers based on the conceptual context of a translation candidate on the source-language side. Such classifiers are built automatically by parallel corpus analysis: subcorpora are created for each translation of a 1:n package, and concepts that correlate within these subcorpora are identified as features of the classifier. The resulting resource can easily be linked to the transfer components of MT systems, as it does not depend on internal analysis structures. Tests show that conceptual transfer outperforms the selection techniques currently used in operational MT systems.
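
The classifier construction described in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: "concepts" are approximated here by plain source-side context words, and the weighting (an add-one smoothed log ratio of subcorpus frequency to overall frequency) is an assumption chosen for brevity. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_classifiers(examples):
    """Build one bag-of-features classifier per translation of a 1:n package.

    examples: iterable of (source_context_words, translation) pairs for ONE
    ambiguous source word, e.g. taken from a word-aligned parallel corpus.
    Returns {translation: {context_word: weight}}.
    """
    subcorpora = defaultdict(Counter)  # context counts per translation
    overall = Counter()                # context counts over all translations
    for context, translation in examples:
        for word in set(context):
            subcorpora[translation][word] += 1
            overall[word] += 1

    total = sum(overall.values())
    vocab = len(overall)
    classifiers = {}
    for translation, counts in subcorpora.items():
        sub_total = sum(counts.values())
        weights = {}
        for word, count in counts.items():
            # Assumed scoring: smoothed log ratio of relative frequencies.
            p_sub = (count + 1) / (sub_total + vocab)
            p_all = (overall[word] + 1) / (total + vocab)
            weight = math.log(p_sub / p_all)
            if weight > 0:  # keep only features that correlate positively
                weights[word] = weight
        classifiers[translation] = weights
    return classifiers


def select_translation(classifiers, context):
    """Pick the translation whose features best match the source context."""
    words = set(context)
    scores = {
        translation: sum(weights.get(word, 0.0) for word in words)
        for translation, weights in classifiers.items()
    }
    return max(scores, key=scores.get)


# Toy data (hypothetical): English "bank" with two German translations.
EXAMPLES = [
    (["river", "water", "shore"], "Ufer"),
    (["river", "boat"], "Ufer"),
    (["money", "account", "loan"], "Bank"),
    (["money", "credit"], "Bank"),
]
CLASSIFIERS = build_classifiers(EXAMPLES)
```

Because the sketch works only on source-side context words, it does not depend on any internal analysis structure of an MT system, which mirrors the linking property the abstract mentions.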

2013

A modular open-source focused crawler for mining monolingual and bilingual corpora from the web
Vassilis Papavassiliou | Prokopis Prokopidis | Gregor Thurmair
Proceedings of the Sixth Workshop on Building and Using Comparable Corpora

2012

Efficiency-based evaluation of aligners for industrial applications
Antonio Toral | Marc Poch | Pavel Pecina | Gregor Thurmair
Proceedings of the 16th Annual conference of the European Association for Machine Translation

EASTIN-CL: A multilingual front-end to a database of Assistive Technology products
Gregor Thurmair | Andrea Agnoletto | Valerio Gower | Roberts Rozis
Proceedings of the 16th Annual conference of the European Association for Machine Translation

Creating Term and Lexicon Entries from Phrase Tables
Gregor Thurmair | Vera Aleksić
Proceedings of the 16th Annual conference of the European Association for Machine Translation

Large Scale Lexical Analysis
Gregor Thurmair | Vera Aleksić | Christoph Schwarz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a lexical analysis component as implemented in the PANACEA project. The goal is to extract lexicon entries automatically from crawled corpora, in an attempt to use corpus-based methods for high-quality linguistic text processing, and to focus on the quality of the data without neglecting quantitative aspects. The task of lexical analysis is to assign linguistic information (such as part of speech, inflectional class, gender, subcategorisation frame, and semantic properties) to all parts of the input text. If tokens are ambiguous, lexical analysis must provide all possible sets of annotations for later (syntactic) disambiguation, be it tagging or full parsing. The paper presents an approach to assigning part-of-speech tags for German and English to large input corpora (> 50 million tokens), providing a workflow that takes crawled corpora as input and produces POS-tagged lemmata ready for lexicon integration. Tools include sentence splitting, lexicon lookup, decomposition, and POS defaulting. Evaluation shows that the overall error rate can be brought down to about 2% if language resources are properly designed. The complete workflow is implemented as a sequence of web services integrated into the PANACEA platform.
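
The four processing steps named in the abstract above (sentence splitting, lexicon lookup, decomposition, POS defaulting) can be sketched as a small fallback chain. This is only an illustration under stated assumptions, not the PANACEA component: the lexicon, the tag names, the two-part compound splitter, and the suffix heuristics are all hypothetical, and English toy data stands in for the German and English resources the paper describes.

```python
import re

# Toy lexicon (hypothetical): surface form -> set of possible POS tags.
LEXICON = {
    "the": {"DET"},
    "book": {"NOUN", "VERB"},
    "shelf": {"NOUN"},
    "is": {"VERB"},
    "full": {"ADJ"},
}


def split_sentences(text):
    """Split after sentence-final punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lowercased word tokens; punctuation is dropped in this sketch."""
    return re.findall(r"\w+", sentence.lower())


def decompose(token):
    """Split an unknown token into two known parts (both at least 2 chars).

    Assumption: the right-hand part is the head and supplies the tags.
    """
    for i in range(2, len(token) - 1):
        left, right = token[:i], token[i:]
        if left in LEXICON and right in LEXICON:
            return LEXICON[right]
    return None


def default_pos(token):
    """Last resort: guess a tag from the suffix (crude, assumed heuristic)."""
    if token.endswith("ly"):
        return {"ADV"}
    if token.endswith(("ing", "ed")):
        return {"VERB"}
    return {"NOUN"}


def analyse(text):
    """Return, per sentence, a list of (token, all possible tags)."""
    result = []
    for sentence in split_sentences(text):
        analysed = []
        for token in tokenize(sentence):
            # Fallback chain: lexicon lookup, then decomposition, then default.
            tags = LEXICON.get(token) or decompose(token) or default_pos(token)
            analysed.append((token, sorted(tags)))
        result.append(analysed)
    return result
```

Ambiguous tokens keep every candidate tag, so a later tagger or parser can disambiguate them, as the abstract requires. For example, `analyse("Book quickly!")` would keep both `NOUN` and `VERB` for "book".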

2011

Personal Translator at WMT2011
Vera Aleksić | Gregor Thurmair
Proceedings of the Sixth Workshop on Statistical Machine Translation

2004

Multilingual Content Processing
Gregor Thurmair
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Making term extraction tools usable
Gregor Thurmair
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

2002

From Resources to Applications. Designing the Multilingual ISLE Lexical Entry
Sue Atkins | Nuria Bel | Francesca Bertagna | Pierrette Bouillon | Nicoletta Calzolari | Christiane Fellbaum | Ralph Grishman | Alessandro Lenci | Catherine MacLeod | Martha Palmer | Gregor Thurmair | Marta Villegas | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

1999

The L&H approach to development of tools for new languages
Gregor Thurmair | Johannes Ritzke
EAMT Workshop: EU and the new languages

1990

Parsing for Grammar and Style Checking
Gregor Thurmair
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics