Jean Veronis

Also published as: Jean Véronis


2006

Evaluation of multilingual text alignment systems: the ARCADE II project
Yun-Chuang Chiao | Olivier Kraif | Dominique Laurent | Thi Minh Huyen Nguyen | Nasredine Semmar | François Stuck | Jean Véronis | Wajdi Zaghouani
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the ARCADE II project, concerned with the evaluation of parallel text alignment systems. The ARCADE II project aims at exploring the techniques of multilingual text alignment through a fine-grained evaluation of existing techniques and the development of new alignment methods. The evaluation campaign consists of two tracks devoted to the evaluation of alignment at sentence and word level respectively. It differs from ARCADE I in the multilingual aspect and the investigation of lexical alignment.

2004

The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous Speech for Romance Languages
Emanuela Cresti | Fernanda Bacelar do Nascimento | Antonio Moreno Sandoval | Jean Veronis | Philippe Martin | Khalid Choukri
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

The C-ORAL-ROM project has delivered a multilingual corpus of spontaneous speech for the main Romance languages (Italian, French, Portuguese and Spanish). The collection aims to represent the variety of speech acts performed in everyday language and to enable the description of prosodic and syntactic structures in the four Romance languages. Sampling criteria are defined in a corpus design scheme. C-ORAL-ROM adopts two different sampling strategies, one for the formal and one for the informal part: while a set of typical domains of application is selected to document the formal use of language, the informal part documents speech variation using parameters referring to the event’s structure (dialogue vs. monologue) and the sociological domain of use (family-private vs. public). The four Romance corpora are tagged with respect to terminal and non-terminal prosodic breaks. Terminal breaks are assumed to be the most relevant cues for identifying the relevant linguistic domains in spontaneous speech (utterances). Relations with other concurrent criteria are discussed. The multimedia storage of the C-ORAL-ROM corpus is based on this principle: each textual string ending with a terminal break is aligned, through the WinPitch speech software, to its acoustic counterpart, generating the database of all utterances.

2002

The C-ORAL-ROM Project. New methods for spoken language archives in a multilingual romance corpus
Emanuela Cresti | Massimo Moneglia | Fernanda Bacelar do Nascimento | Antonio Moreno Sandoval | Jean Veronis | Philippe Martin | Khalid Choukri | Valerie Mapelli | Daniele Falavigna | Antonio Cid | Claude Blum
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

1998

Methods and Practical Issues in Evaluating Alignment Techniques
Philippe Langlais | Michel Simard | Jean Veronis
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Methods and Practical Issues in Evaluating Alignment Techniques
Philippe Langlais | Michel Simard | Jean Veronis
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Introduction to the Special Issue on Word Sense Disambiguation: The State of the Art
Nancy Ide | Jean Véronis
Computational Linguistics, Volume 24, Number 1, March 1998 - Special Issue on Word Sense Disambiguation

1994

MULTEXT: Multilingual Text Tools and Corpora
Nancy Ide | Jean Veronis
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

1993

Knowledge extraction from machine-readable dictionaries: an evaluation
Nancy Ide | Jean Véronis
Third International EAMT Workshop: Machine Translation and the Lexicon

Machine-readable versions of everyday dictionaries have been seen as a likely source of information for use in natural language processing because they contain an enormous amount of lexical and semantic knowledge. However, after 15 years of research, the results appear to be disappointing. No comprehensive evaluation of machine-readable dictionaries (MRDs) as a knowledge source has been made to date, although this is necessary to determine what, if anything, can be gained from MRD research. To this end, this paper will first consider the postulates upon which MRD research has been based over the past fifteen years, discuss the validity of these postulates, and evaluate the results of this work. We will then propose possible future directions and applications that may exploit these years of effort, in the light of current directions in not only NLP research, but also fields such as lexicography and electronic publishing.

1992

Disjunctive Feature Structures as Hypergraphs
Jean Veronis
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

A Feature-Based Model for Lexical Databases
Jean Veronis | Nancy Ide
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

1991

An Assessment of Semantic Information Automatically Extracted From Machine Readable Dictionaries
Jean Veronis | Nancy Ide
Fifth Conference of the European Chapter of the Association for Computational Linguistics

1990

Word Sense Disambiguation with Very Large Neural Networks Extracted from Machine Readable Dictionaries
Jean Veronis | Nancy M. Ide
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics

1988

Morphosyntactic correction in natural language interfaces
Jean Veronis
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics