Michael Carl


2019

Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production
Michael Carl | Silvia Hansen-Schirra
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

Lexical Representation & Retrieval on Monolingual Interpretative text production
Debasish Sahoo | Michael Carl
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

2018

Literality and cognitive effort: Japanese and Spanish
Isabel Lacruz | Michael Carl | Masaru Yamada
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Experiments in Non-Coherent Post-editing
Cristina Toledo Báez | Moritz Schaeffer | Michael Carl
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

Market pressure on translation productivity joined with technological innovation is likely to fragment and decontextualise translation jobs even more than is currently the case. Many different translators increasingly work on one document at different places, collaboratively working in the cloud. This paper investigates the effect of decontextualised source texts on behaviour by comparing post-editing of sequentially ordered sentences with shuffled sentences from two different texts. The findings suggest that there is little or no effect of the decontextualised source texts on behaviour.

2016

English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting
Michael Carl | Akiko Aizawa | Masaru Yamada
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Speech-enabled interfaces have the potential to become one of the most efficient and ergonomic environments for human-computer interaction and for text production. However, not much research has been carried out to investigate in detail the processes and strategies involved in the different modes of text production. This paper introduces and evaluates a corpus of more than 55 hours of English-to-Japanese user activity data that were collected within the ENJA15 project, in which translators were observed while writing and speaking translations (translation dictation) and during machine translation post-editing. The transcription of the spoken data, keyboard logging and eye-tracking data were recorded with Translog-II, post-processed and integrated into the CRITT Translation Process Research-DB (TPR-DB), which is publicly available under a creative commons license. The paper presents the ENJA15 data as part of a large multilingual Chinese, Danish, German, Hindi and Spanish translation process data collection of more than 760 translation sessions. It compares the ENJA15 data with the other language pairs and reviews some of its particularities.

Measuring Cognitive Translation Effort with Activity Units
Moritz Jonas Schaeffer | Michael Carl | Isabel Lacruz | Akiko Aizawa
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

2014

CASMACAT: A Computer-assisted Translation Workbench
Vicent Alabau | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Ulrich Germann | Jesús González-Rubio | Robin Hill | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Ortiz-Martínez | Herve Saint-Amand | Germán Sanchis Trilles | Chara Tsoukala
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
Ulrich Germann | Michael Carl | Philipp Koehn | Germán Sanchis-Trilles | Francisco Casacuberta | Robin Hill | Sharon O’Brien
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation

Measuring the Cognitive Effort of Literal Translation Processes
Moritz Schaeffer | Michael Carl
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation

CASMACAT: cognitive analysis and statistical methods for advanced computer aided translation
Philipp Koehn | Michael Carl | Francisco Casacuberta | Eva Marcos
Proceedings of the 17th Annual conference of the European Association for Machine Translation

SEECAT: ASR & Eye-tracking enabled computer-assisted translation
Mercedes García-Martínez | Karan Singla | Aniruddha Tammewar | Bartolomé Mesa-Lao | Ankita Thakur | Anusuya M.A. | Srinivas Bangalore | Michael Carl
Proceedings of the 17th Annual conference of the European Association for Machine Translation

CFT13: A resource for research into the post-editing process
Michael Carl | Mercedes Martínez García | Bartolomé Mesa-Lao
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the most recent dataset that has been added to the CRITT Translation Process Research Database (TPR-DB). Under the name CFT13, this new study contains user activity data (UAD) in the form of key-logging and eye-tracking collected during the second CasMaCat field trial in June 2013. The CFT13 is a publicly available resource featuring a number of simple and compound process and product units suited to investigate human-computer interaction while post-editing machine translation outputs.

Evaluating the effects of interactivity in a post-editing workbench
Nancy Underwood | Bartolomé Mesa-Lao | Mercedes García Martínez | Michael Carl | Vicent Alabau | Jesús González-Rubio | Luis A. Leiva | Germán Sanchis-Trilles | Daniel Ortíz-Martínez | Francisco Casacuberta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the field trial and subsequent evaluation of a post-editing workbench which is currently under development in the EU-funded CasMaCat project. Based on user evaluations of the initial prototype of the workbench, this second prototype of the workbench includes a number of interactive features designed to improve productivity and user satisfaction. Using CasMaCat’s own facilities for logging keystrokes and eye tracking, data were collected from nine post-editors in a professional setting. These data were then used to investigate the effects of the interactive features on productivity, quality, user satisfaction and cognitive load as reflected in the post-editors’ gaze activity. These quantitative results are combined with the qualitative results derived from user questionnaires and interviews conducted with all the participants.

2013

Automatically Predicting Sentence Translation Difficulty
Abhijit Mishra | Pushpak Bhattacharyya | Michael Carl
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research
Michael Carl
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a novel implementation of Translog-II. Translog-II is a Windows-oriented program to record and study reading and writing processes on a computer. In our research, it is an instrument to acquire objective, digital data of human translation processes. Like its predecessors, Translog 2000 and Translog 2006, Translog-II consists of two main components: Translog-II Supervisor and Translog-II User, which are used to create a project file, to run a text production experiment (a user reads, writes or translates a text) and to replay the session. Translog produces a log file which contains all user activity data of the reading, writing, or translation session, and which can be evaluated by external tools. While there is a large body of translation process research based on Translog, this paper gives an overview of the Translog-II functions and its data visualization options.

Proceedings of the First Workshop on Eye-tracking and Natural Language Processing
Michael Carl | Pushpak Bhattacharyya | Kamal Kumar Choudhary
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing

A heuristic-based approach for systematic error correction of gaze data for reading
Abhijit Mishra | Michael Carl | Pushpak Bhattacharyya
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing

2010

Correlating Translation Product and Translation Process Data of Professional and Student Translators
Michael Carl | Matthias Buch-Kromann
Proceedings of the 14th Annual conference of the European Association for Machine Translation

2008

Using Log-linear Models for Tuning Machine Translation Output
Michael Carl
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe a set of experiments to explore statistical techniques for ranking and selecting the best translations in a graph of translation hypotheses. In a previous paper (Carl, 2007) we have described how the graph of hypotheses is generated through shallow transfer and chunk permutation rules, where nodes consist of vectors representing morpho-syntactic properties of words and phrases. This paper describes a number of methods to train statistical feature functions from some of the vector’s components. The feature functions are trained off-line on different types of text and their log-linear combination is then used to retrieve the best translation paths in the graph. We compare two language modelling toolkits, the CMU and the SRI toolkit, and arrive at three results: 1) models of lemma-based feature functions produce better results than token-based models, 2) adding a PoS-tag feature function to the lemma models improves the output and 3) weights for lexical translations are suited if the training material is similar to the texts to be translated.

Evaluation of a Machine Translation System for Low Resource Languages: METIS-II
Vincent Vandeghinste | Peter Dirix | Ineke Schuurman | Stella Markantonatou | Sokratis Sofianopoulos | Marina Vassiliou | Olga Yannoutsou | Toni Badia | Maite Melero | Gemma Boleda | Michael Carl | Paul Schmidt
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we describe the METIS-II system and its evaluation on each of the language pairs: Dutch, German, Greek, and Spanish to English. The METIS-II system envisaged developing a data-driven approach in which no parallel corpus is required, and in which no full parser or extensive rule sets are needed. We describe evaluation on a development test set and on a test set coming from Europarl, and compare our results with SYSTRAN. We also provide some further analysis, researching the impact of the number and source of the reference translations and analysing the results according to test text type. The results are expectably lower for the METIS system, but not at an unattainable distance from a mature system like SYSTRAN.

Modelling human translator behaviour with user-activity data
Michael Carl | Arnt Lykke Jakobsen | Kristian T.H. Jensen
Proceedings of the 12th Annual conference of the European Association for Machine Translation

2006

METIS-II: Machine Translation for Low Resource Languages
Vincent Vandeghinste | Ineke Schuurman | Michael Carl | Stella Markantonatou | Toni Badia
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we describe a machine translation prototype in which we use only minimal resources for both the source and the target language. A shallow source language analysis, combined with a translation dictionary and a mapping system of source language phenomena into the target language and a target language corpus for generation are all the resources needed in the described system. Several approaches are presented.

A Dictionary Lookup Strategy for Translating of Discontinuous Phrases
Michael Carl | Ecaterina Rascu
Proceedings of the 11th Annual conference of the European Association for Machine Translation

2005

Using template-grammars for shake & bake paraphrasing
Michael Carl | Ecaterina Rascu | Paul Schmidt
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

2004

Controlling Gender Equality with Shallow NLP Techniques
Michael Carl | Sandrine Garnier | Johann Haller | Anne Altmayer | Bärbel Miemietz
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Using Weighted Abduction to Align Term Variant Translations in Bilingual Texts
Michael Carl | Ecaterina Rascu | Johann Haller
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Phrase-based Evaluation of Word-to-Word Alignments
Michael Carl | Sisay Fissaha
Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond

Data-assisted controlled translation
Michael Carl
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

Tuning general translation knowledge to a sublanguage
Michael Carl | Philippe Langlais
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

2002

An Intelligent Terminology Database as a Pre-processor for Statistical Machine Translation
Michael Carl | Philippe Langlais
COLING-02: COMPUTERM 2002: Second International Workshop on Computational Terminology

2001

Inducing probabilistic invertible translation grammars from aligned texts
Michael Carl
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (ConLL)

2000

A Model of Competence for Corpus-Based Machine Translation
Michael Carl
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1998

A Constructivist Approach to Machine Translation
Michael Carl
New Methods in Language Processing and Computational Natural Language Learning

Shallow Post Morphological Processing with KURD
Michael Carl | Antje Schmidt-Wigger
New Methods in Language Processing and Computational Natural Language Learning