Meriama Laib

Also published as: Mariama Laib, Meriama Laïb


2017

Building Multiword Expressions Bilingual Lexicons for Domain Adaptation of an Example-Based Machine Translation System
Nasredine Semmar | Mariama Laib
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We describe in this paper a hybrid approach to automatically build bilingual lexicons of Multiword Expressions (MWEs) from parallel corpora. More specifically, we investigate the impact of using a domain-specific bilingual lexicon of MWEs on domain adaptation of an Example-Based Machine Translation (EBMT) system. We conducted experiments on the English-French language pair and two kinds of texts: in-domain texts from Europarl (European Parliament proceedings) and out-of-domain texts from Emea (European Medicines Agency documents) and Ecb (European Central Bank corpus). The results indicate that integrating domain-specific bilingual lexicons of MWEs improves the translation quality of the EBMT system when the texts to translate are related to the specific domain, and induces a relatively slight deterioration of translation quality when translating general-purpose texts.

2016

Etude de l’impact d’un lexique bilingue spécialisé sur la performance d’un moteur de traduction à base d’exemples (Studying the impact of a specialized bilingual lexicon on the performance of an example-based machine translation engine)
Nasredine Semmar | Othman Zennaki | Meriama Laib
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)

Statistical machine translation, although effective, is currently limited because it requires large volumes of parallel corpora, which do not exist for all language pairs and all specialized domains, and whose production is slow and costly. In this paper, we present a prototype of an example-based machine translation engine that uses cross-language information retrieval and requires only a corpus of texts in the target language. More specifically, we propose to study the impact of a specialized bilingual lexicon on the performance of this prototype. We evaluate this translation prototype and compare its results with those of the Moses statistical machine translation system, using the English-French parallel corpora Europarl (European Parliament Proceedings) and Emea (European Medicines Agency Documents). The results show that the BLEU score of the example-based machine translation prototype is close to that of the Moses system on documents from the Europarl corpus and better on documents extracted from the Emea corpus.

2015

Evaluating the Impact of Using a Domain-specific Bilingual Lexicon on the Performance of a Hybrid Machine Translation Approach
Nasredine Semmar | Othman Zennaki | Meriama Laib
Proceedings of the International Conference Recent Advances in Natural Language Processing

Improving the Performance of an Example-Based Machine Translation System Using a Domain-specific Bilingual Lexicon
Nasredine Semmar | Othman Zennaki | Meriama Laib
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters

2010

LIMA : A Multilingual Framework for Linguistic Analysis and Linguistic Resources Development and Evaluation
Romaric Besançon | Gaël de Chalendar | Olivier Ferret | Faiza Gara | Olivier Mesnard | Meriama Laïb | Nasredine Semmar
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The increasing amount of available textual information makes the use of Natural Language Processing (NLP) tools necessary. These tools have to be used on large collections of documents in different languages. But NLP is a complex task that relies on many processes and resources. As a consequence, NLP tools must be both configurable and efficient: specific software architectures must be designed for this purpose. We present in this paper the LIMA multilingual analysis platform, developed at CEA LIST. This configurable platform has been designed to develop NLP-based industrial applications while keeping enough flexibility to integrate various processes and resources. This design makes LIMA a linguistic analyzer that can handle languages as different as French, English, German, Arabic, or Chinese. Beyond its architecture principles and its capabilities as a linguistic analyzer, LIMA also offers a set of tools dedicated to the testing and evaluation of linguistic modules and to the production and management of new linguistic resources.

2006

A Deep Linguistic Analysis for Cross-language Information Retrieval
Nasredine Semmar | Meriama Laib | Christian Fluhr
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Cross-language information retrieval consists in providing a query in one language and searching documents in one or more languages. These documents are ordered by their probability of being relevant to the user's request. The highest ranked document is considered to be the most likely relevant document. The LIC2M cross-language information retrieval system is a weighted Boolean search engine based on a deep linguistic analysis of the query and the documents. This system is composed of a linguistic analyzer, a statistic analyzer, a reformulator, a comparator and a search engine. The linguistic analysis processes both the documents to be indexed and the queries to extract concepts representing their content. This analysis includes morphological analysis, part-of-speech tagging and syntactic analysis. In this paper, we present the deep linguistic analysis used in the LIC2M cross-lingual search engine, focusing in particular on the impact of the syntactic analysis on retrieval effectiveness.