Maxim Khalilov


2020

A Post-Editing Dataset in the Legal Domain: Do we Underestimate Neural Machine Translation Quality?
Julia Ive | Lucia Specia | Sara Szoc | Tom Vanallemeersch | Joachim Van den Bogaert | Eduardo Farah | Christine Maroti | Artur Ventura | Maxim Khalilov
Proceedings of the 12th Language Resources and Evaluation Conference

We introduce a machine translation dataset for three pairs of languages in the legal domain with post-edited high-quality neural machine translation and independent human references. The data was collected as part of the EU APE-QUEST project and comprises crawled content from EU websites with translation from English into three European languages: Dutch, French and Portuguese. Altogether, the data consists of around 31K tuples including a source sentence, the respective machine translation by a neural machine translation system, a post-edited version of that translation by a professional translator, and - where available - the original reference translation crawled from parallel language websites. We describe the data collection process, provide an analysis of the resulting post-edits and benchmark the data using state-of-the-art quality estimation and automatic post-editing models. As an interesting by-product, our post-editing analysis suggests that neural systems built with publicly available general domain data can provide high-quality translations, even though comparison to human references suggests that this quality is quite low. This makes our dataset a suitable candidate to test evaluation metrics. The data is freely available as an ELRC-SHARE resource.

2019

APE-QUEST
Joachim Van den Bogaert | Heidi Depraetere | Sara Szoc | Tom Vanallemeersch | Koen Van Winckel | Frederic Everaert | Lucia Specia | Julia Ive | Maxim Khalilov | Christine Maroti | Eduardo Farah | Artur Ventura
Proceedings of Machine Translation Summit XVII Volume 2: Translator, Project and User Tracks

2018

Machine translation at Booking.com: what’s next?
Maxim Khalilov
Proceedings of the AMTA 2018 Workshop on Translation Quality Estimation and Automatic Post-Editing

2014

Machine translation for LSPs: strategy and implementation
Maxim Khalilov
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)

2013

English-to-Russian MT evaluation campaign
Pavel Braslavski | Alexander Beloborodov | Maxim Khalilov | Serge Sharoff
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Building English-Chinese and Chinese-English MT engines for the computer software domain
Maxim Khalilov | Rahzeb Choudhury
Proceedings of the 16th Annual conference of the European Association for Machine Translation

2011

ILLC-UvA translation system for EMNLP-WMT 2011
Maxim Khalilov | Khalil Sima’an
Proceedings of the Sixth Workshop on Statistical Machine Translation

Context-Sensitive Syntactic Source-Reordering by Statistical Transduction
Maxim Khalilov | Khalil Sima’an
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Source reordering using MaxEnt classifiers and supertags
Maxim Khalilov | Khalil Sima’an
Proceedings of the 14th Annual conference of the European Association for Machine Translation

A Discriminative Syntactic Model for Source Permutation via Tree Transduction
Maxim Khalilov | Khalil Sima’an
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

Towards Improving English-Latvian Translation: A System Comparison and a New Rescoring Feature
Maxim Khalilov | José A. R. Fonollosa | Inguna Skadiņa | Edgars Brālītis | Lauma Pretkalniņa
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Translation into languages with relatively free word order has received much less attention than translation into fixed word order languages (English) or into analytical languages (Chinese). At the same time, this translation task is among the most difficult challenges for machine translation (MT), and intuitively there seems to be room for improvement by reflecting the free word order structure of the target language. This paper presents a comparative study of two alternative approaches to statistical machine translation (SMT) and their application to the task of English-to-Latvian translation. Furthermore, a novel feature intended to reflect the relatively free word order scheme of the Latvian language is proposed and successfully applied at the n-best list rescoring step. Moving beyond the automatic scores of translation quality that are classically presented in MT research papers, we contribute a manual error analysis of MT system output that helps to shed light on the advantages and disadvantages of the SMT systems under consideration.

2009

The TALP-UPC Phrase-Based Translation System for EACL-WMT 2009
José A. R. Fonollosa | Maxim Khalilov | Marta R. Costa-jussà | José B. Mariño | Carlos A. Henríquez Q. | Adolfo Hernández H. | Rafael E. Banchs
Proceedings of the Fourth Workshop on Statistical Machine Translation

Coupling Hierarchical Word Reordering and Decoding in Phrase-Based Statistical Machine Translation
Maxim Khalilov | José A. R. Fonollosa | Mark Dras
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

A New Subtree-Transfer Approach to Syntax-Based Reordering for Statistical Machine Translation
Maxim Khalilov | José A. R. Fonollosa | Mark Dras
Proceedings of the 13th Annual conference of the European Association for Machine Translation

N-Gram-Based Statistical Machine Translation versus Syntax Augmented Machine Translation: Comparison and System Combination
Maxim Khalilov | José A. R. Fonollosa
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

The TALP-UPC Ngram-Based Statistical Machine Translation System for ACL-WMT 2008
Maxim Khalilov | Adolfo Hernández H. | Marta R. Costa-jussà | Josep M. Crego | Carlos A. Henríquez Q. | Patrik Lambert | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Third Workshop on Statistical Machine Translation

2007

Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
Marta R. Costa-jussà | Josep M. Crego | Patrik Lambert | Maxim Khalilov | José A. R. Fonollosa | José B. Mariño | Rafael E. Banchs
Proceedings of the Second Workshop on Statistical Machine Translation

2006

TALP Phrase-based statistical translation system for European language pairs
Marta R. Costa-jussà | Josep M. Crego | Adrià de Gispert | Patrik Lambert | Maxim Khalilov | José B. Mariño | José A. R. Fonollosa | Rafael Banchs
Proceedings on the Workshop on Statistical Machine Translation

N-gram-based SMT System Enhanced with Reordering Patterns
Josep M. Crego | Adrià de Gispert | Patrik Lambert | Marta R. Costa-jussà | Maxim Khalilov | Rafael Banchs | José B. Mariño | José A. R. Fonollosa
Proceedings on the Workshop on Statistical Machine Translation