Mariona Taulé

Also published as: M. Taulé, Mariona Taule


2016

Problematic Cases in the Annotation of Negation in Spanish
Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López | Toni Martí | Mariona Taulé
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics (ExProM)

This paper presents the main sources of disagreement found during the annotation of the Spanish SFU Review Corpus with negation (SFU ReviewSP-NEG). Negation detection is a challenge in most NLP tasks, so the availability of corpora annotated with this phenomenon is essential for progress in this area. A thorough analysis of the problems found during the annotation could help in the study of this phenomenon.

2012

Empirical Methods for the Study of Denotation in Nominalizations in Spanish
Aina Peris | Mariona Taulé | Horacio Rodríguez
Computational Linguistics, Volume 38, Issue 4 - December 2012

2010

SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
Marta Recasens | Lluís Màrquez | Emili Sapena | M. Antònia Martí | Mariona Taulé | Véronique Hoste | Massimo Poesio | Yannick Versley
Proceedings of the 5th International Workshop on Semantic Evaluation

ADN-Classifier: Automatically Assigning Denotation Types to Nominalizations
Aina Peris | Mariona Taulé | Gemma Boleda | Horacio Rodríguez
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper presents the ADN-Classifier, an Automatic classification system for Spanish Deverbal Nominalizations aimed at identifying their semantic denotation (i.e. event, result, underspecified, or lexicalized). The classifier can be used for NLP tasks such as coreference resolution or paraphrase detection. To our knowledge, the ADN-Classifier is the first effort to acquire denotations for nominalizations using Machine Learning. We compare the results of the classifier when using a decreasing number of Knowledge Sources, namely (1) the complete nominal lexicon (AnCora-Nom) that includes sense distinctions, (2) the nominal lexicon (AnCora-Nom) without the sense-specific information, (3) nominalizations’ context information obtained from a treebank corpus (AnCora-Es) and (4) the combination of the previous linguistic resources. In a realistic scenario, that is, without sense distinction, the best results are achieved by taking into account the information declared in the lexicon (89.40% accuracy). This shows that the lexicon contains crucial information (such as argument structure) that corpus-derived features cannot substitute for.

2009

SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
Marta Recasens | Toni Martí | Mariona Taulé | Lluís Màrquez | Emili Sapena
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

2008

AnCora: Multilevel Annotated Corpora for Catalan and Spanish
Mariona Taulé | M. Antònia Martí | Marta Recasens
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents AnCora, a multilingual corpus annotated at different linguistic levels, consisting of 500,000 words in Catalan (AnCora-Ca) and in Spanish (AnCora-Es). At present, AnCora is the largest multilayer annotated corpus of these languages freely available from http://clic.ub.edu/ancora. The two corpora consist mainly of newspaper texts annotated at different levels of linguistic description: morphological (PoS and lemmas), syntactic (constituents and functions), and semantic (argument structures, thematic roles, semantic verb classes, named entities, and WordNet nominal senses). All resulting layers are independent of each other, which makes data management easier. The annotation was performed manually, semiautomatically, or fully automatically, depending on the encoded linguistic information. The development of these basic resources constituted a primary objective, since there was a lack of such resources for these languages. A second goal was the definition of a consistent methodology that can be followed in further annotations. The current versions of AnCora have been used in several international evaluation competitions.

AnCora-Verb: A Lexical Resource for the Semantic Annotation of Corpora
Juan Aparicio | Mariona Taulé | M. Antònia Martí
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we present two large-scale verbal lexicons, AnCora-Verb-Ca for Catalan and AnCora-Verb-Es for Spanish, which are the basis for the semantic annotation of the AnCora corpora with arguments and thematic roles. In the AnCora-Verb lexicons, the mapping between syntactic functions, arguments and thematic roles of each verbal predicate is established taking into account the verbal semantic class and the diathesis alternations in which the predicate can participate. Each verbal predicate is related to one or more semantic classes, basically differentiated according to the four event classes (accomplishments, achievements, states and activities) and to the diathesis alternations in which a verb can occur. AnCora-Verb-Es contains a total of 1,965 different verbs corresponding to 3,671 senses and AnCora-Verb-Ca contains 2,151 verbs and 4,513 senses. These figures correspond to the total of 500,000 words contained in each corpus, AnCora-Ca and AnCora-Es. The lexicons and the annotated corpora constitute the richest linguistic resources of this kind freely available for Spanish and Catalan. The large amount of linguistic information contained in both resources should be of great interest for computational applications and linguistic studies. Currently, a consulting interface for these lexicons is available at http://clic.ub.edu/ancora/.

2007

SemEval-2007 Task 09: Multilevel Semantic Annotation of Catalan and Spanish
Lluís Màrquez | Luis Villarejo | M. A. Martí | Mariona Taulé
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2004

MiniCors and Cast3LB: Two Semantically Tagged Spanish Corpora
M. Taulé | M. Civit | N. Artigas | M. García | L. Màrquez | M.A. Martí | B. Navarro
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Senseval-3: The Spanish lexical sample task
Lluis Màrquez | Mariona Taulé | Antonia Martí | Núria Artigas | Mar García | Francis Real | Dani Ferrés
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

Senseval-3: The Catalan lexical sample task
Lluis Màrquez | Mariona Taulé | Antonia Martí | Mar García | Francis Real | Dani Ferrés
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

2001

Framework and Results for the Spanish SENSEVAL
German Rigau | Mariona Taulé | Ana Fernandez | Julio Gonzalo
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

1992

SEISD: An environment for extraction of Semantic Information from on-line dictionaries
Alicia Ageno | Irene Castellon | M. A. Marti | German Rigau | Francesc Ribas | Horacio Rodriguez | Mariona Taule | Felisa Verdejo
Third Conference on Applied Natural Language Processing