Oier Lopez de Lacalle

Also published as: Oier López de Lacalle, Oier Lopez de Lacalle


2020

Linguistic Appropriateness and Pedagogic Usefulness of Reading Comprehension Questions
Andrea Horbach | Itziar Aldabe | Marie Bexte | Oier Lopez de Lacalle | Montse Maritxalar
Proceedings of the 12th Language Resources and Evaluation Conference

Automatic generation of reading comprehension questions is a topic receiving growing interest in the NLP community, but there is currently no consensus on evaluation metrics, and many approaches focus on linguistic quality only while ignoring the pedagogic value and appropriateness of questions. This paper addresses these weaknesses with a new evaluation scheme in which questions from the questionnaire are structured in a hierarchical way to avoid confronting human annotators with evaluation measures that do not make sense for a certain question. We show through an annotation study that our scheme can be applied, but that annotators with some level of expertise are needed. We also created and evaluated two new evaluation data sets from the biology domain for Basque and German, composed of questions written by people with an educational background, which will be publicly released. Results show that manually generated questions are in general of both higher linguistic and higher pedagogic quality, and that among the human-generated questions, teacher-generated ones tend to be most useful.

Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction
Oscar Sainz | Oier Lopez de Lacalle | Itziar Aldabe | Montse Maritxalar
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper we present a relation extraction system that, given a text, extracts pedagogically motivated relation types, as a previous step to obtaining a semantic representation of the text which will make it possible to automatically generate questions for reading comprehension. The system maps pedagogically motivated relations to relations from ConceptNet and deploys Distant Supervision for relation extraction. We run a study on a subset of those relations in order to analyse the viability of our approach. For that, we build a domain-specific relation extraction system and explore two relation extraction models: a state-of-the-art model based on transfer learning and a discrete feature based machine learning model. Experiments show that the neural model obtains better results in terms of F-score, and we obtain promising results on the subset of relations suitable for pedagogical purposes. We thus consider that distant supervision for relation extraction is a valid approach in our target domain, i.e. biology.

Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation
Piroska Lendvai | Sándor Darányi | Christian Geng | Moniek Kuijpers | Oier Lopez de Lacalle | Jean-Christophe Mensonides | Simone Rebora | Uwe Reichel
Proceedings of the 12th Language Resources and Evaluation Conference

To detect how and when readers are experiencing engagement with a literary work, we bring together empirical literary studies and language technology by focusing on the affective state of absorption. The goal of our resource development is to enable the detection of different levels of reading absorption in millions of user-generated reviews hosted on social reading platforms. We present a corpus of social book reviews in English that we annotated with reading absorption categories. Based on these data, we performed supervised, sentence-level, binary classification of the explicit presence vs. absence of the mental state of absorption. We compared the performance of classical machine learners, where features comprised sentence representations obtained from a pretrained embedding model (Universal Sentence Encoder), with that of neural classifiers in which sentence embedding vector representations are adapted or fine-tuned while training for the absorption recognition task. We discuss the challenges in creating the labeled data as well as the possibilities for releasing a benchmark corpus.

2018

The risk of sub-optimal use of Open Source NLP Software: UKB is inadvertently state-of-the-art in knowledge-based WSD
Eneko Agirre | Oier López de Lacalle | Aitor Soroa
Proceedings of Workshop for NLP Open Source Software (NLP-OSS)

UKB is an open source collection of programs for performing, among other tasks, Knowledge-Based Word Sense Disambiguation (WSD). Since it was released in 2009 it has often been used out-of-the-box in sub-optimal settings. We show that nine years later it is the state of the art in knowledge-based WSD. This case shows the pitfalls of releasing open source NLP software without optimal default settings and precise instructions for reproducibility.

2016

Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models
Steven Neale | Luís Gomes | Eneko Agirre | Oier Lopez de Lacalle | António Branco
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Although it is commonly assumed that word sense disambiguation (WSD) should help to improve lexical choice and the quality of machine translation systems, how to successfully integrate word senses into such systems remains an unanswered question. Some successful approaches have involved reformulating either WSD or the word senses it produces, but work on using traditional word senses to improve machine translation has met with limited success. In this paper, we build upon previous work that experimented with including word senses as contextual features in maxent-based translation models. Training on a large, open-domain corpus (Europarl), we demonstrate that this approach yields significant improvements in machine translation from English to Portuguese.

Improving Translation Selection with Supersenses
Haiqing Tang | Deyi Xiong | Oier Lopez de Lacalle | Eneko Agirre
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Selecting appropriate translations for source words with multiple meanings remains a challenge for statistical machine translation (SMT). One reason for this is that most SMT systems are not good at detecting the proper sense of a polysemous word when it appears in different contexts. In this paper, we adopt a supersense tagging method to annotate source words with coarse-grained ontological concepts. In order to enable the system to choose an appropriate translation for a word or phrase according to its annotated supersense, we propose two translation models with supersense knowledge: a maximum entropy based model and a supersense embedding model. The effectiveness of our proposed models is validated on a large-scale English-to-Spanish translation task. Results indicate that our method can significantly improve translation quality by correctly conveying the meaning of the source language to the target language.

2015

Crowdsourced Word Sense Annotations and Difficult Words and Examples
Oier Lopez de Lacalle | Eneko Agirre
Proceedings of the 11th International Conference on Computational Semantics

Predicting word sense annotation agreement
Héctor Martínez Alonso | Anders Johannsen | Oier Lopez de Lacalle | Eneko Agirre
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

Diamonds in the Rough: Event Extraction from Imperfect Microblog Data
Ander Intxaurrondo | Eneko Agirre | Oier Lopez de Lacalle | Mihai Surdeanu
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

A Methodology for Word Sense Disambiguation at 90% based on large-scale CrowdSourcing
Oier Lopez de Lacalle | Eneko Agirre
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics

2014

Random Walks for Knowledge-Based Word Sense Disambiguation
Eneko Agirre | Oier López de Lacalle | Aitor Soroa
Computational Linguistics, Volume 40, Issue 1 - March 2014

2013

Unsupervised Relation Extraction with General Domain Knowledge
Oier Lopez de Lacalle | Mirella Lapata
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

EHU-ALM: Similarity-Feature Based Approach for Student Response Analysis
Itziar Aldabe | Montse Maritxalar | Oier Lopez de Lacalle
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

Matching Cultural Heritage items to Wikipedia
Eneko Agirre | Ander Barrena | Oier Lopez de Lacalle | Aitor Soroa | Samuel Fernando | Mark Stevenson
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Digitised Cultural Heritage (CH) items usually have short descriptions and lack rich contextual information. Wikipedia articles, on the contrary, include in-depth descriptions and links to related articles, which motivates the enrichment of CH items with information from Wikipedia. In this paper we explore the feasibility of finding matching articles in Wikipedia for a given Cultural Heritage item. We manually annotated a random sample of items from Europeana, and performed a qualitative and quantitative study of the issues and problems that arise, showing that each kind of CH item is different and needs a nuanced definition of what "matching article" means. In addition, we test a well-known wikification (aka entity linking) algorithm on the task. Our results indicate that a substantial number of items can be effectively linked to their corresponding Wikipedia article.

Enabling the Discovery of Digital Cultural Heritage Objects through Wikipedia
Mark Michael Hall | Oier Lopez de Lacalle | Aitor Soroa Etxabe | Paul Clough | Eneko Agirre
Proceedings of the 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

2010

SemEval-2010 Task 17: All-Words Word Sense Disambiguation on a Specific Domain
Eneko Agirre | Oier Lopez de Lacalle | Christiane Fellbaum | Shu-Kai Hsieh | Maurizio Tesconi | Monica Monachini | Piek Vossen | Roxanne Segers
Proceedings of the 5th International Workshop on Semantic Evaluation

Kyoto: An Integrated System for Specific Domain WSD
Aitor Soroa | Eneko Agirre | Oier Lopez de Lacalle | Wauter Bosma | Piek Vossen | Monica Monachini | Jessie Lo | Shu-Kai Hsieh
Proceedings of the 5th International Workshop on Semantic Evaluation

2009

SemEval-2010 Task 17: All-words Word Sense Disambiguation on a Specific Domain
Eneko Agirre | Oier Lopez de Lacalle | Christiane Fellbaum | Andrea Marchetti | Antonio Toral | Piek Vossen
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

Supervised Domain Adaption for WSD
Eneko Agirre | Oier Lopez de Lacalle
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

On Robustness and Domain Adaptation using SVD for Word Sense Disambiguation
Eneko Agirre | Oier Lopez de Lacalle
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

SemEval-2007 Task 01: Evaluating WSD on Cross-Language Information Retrieval
Eneko Agirre | Bernardo Magnini | Oier Lopez de Lacalle | Arantxa Otegi | German Rigau | Piek Vossen
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

UBC-ALM: Combining k-NN with SVD for WSD
Eneko Agirre | Oier Lopez de Lacalle
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

UBC-UMB: Combining unsupervised and supervised systems for all-words WSD
David Martinez | Timothy Baldwin | Eneko Agirre | Oier Lopez de Lacalle
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

Two graph-based algorithms for state-of-the-art WSD
Eneko Agirre | David Martínez | Oier López de Lacalle | Aitor Soroa
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm
Eneko Agirre | David Martínez | Oier López de Lacalle | Aitor Soroa
Proceedings of TextGraphs: the First Workshop on Graph Based Methods for Natural Language Processing

2004

Publicly Available Topic Signatures for all WordNet Nominal Senses
Eneko Agirre | Oier Lopez de Lacalle
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)