Itziar Gonzalez-Dios


2020

pdf bib
LagunTest: A NLP Based Application to Enhance Reading Comprehension
Kepa Bengoetxea | Itziar Gonzalez-Dios | Amaia Aguirregoitia
Proceedings of the 1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI)

The ability to read and understand written texts plays an important role in education, above all in the last years of primary education. This is especially pertinent in language immersion educational programmes, where some students have low linguistic competence in the languages of instruction. In this context, adapting the texts to the individual needs of each student requires a considerable effort by education professionals. However, language technologies can facilitate the laborious adaptation of materials in order to enhance reading comprehension. In this paper, we present LagunTest, a NLP based application that takes as input a text in Basque or English, and offers synonyms, definitions, examples of the words in different contexts and presents some linguistic characteristics as well as visualizations. LagunTest is based on reusable and open multilingual and multimodal tools, and it is also distributed with an open license. LagunTest is intended to ease the burden of education professionals in the task of adapting materials, and the output should always be supervised by them.

pdf bib
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)
Thierry Declerk | Itziar Gonzalez-Dios | German Rigau
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)

pdf bib
Towards modelling SUMO attributes through WordNet adjectives: a Case Study on Qualities
Itziar Gonzalez-Dios | Javier Alvez | German Rigau
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)

Previous studies have shown that the knowledge about attributes and properties in the SUMO ontology and its mapping to WordNet adjectives lacks of an accurate and complete characterization. A proper characterization of this type of knowledge is required to perform formal commonsense reasoning based on the SUMO properties, for instance to distinguish one concept from another based on their properties. In this context, we propose a new semi-automatic approach to model the knowledge about properties and attributes in SUMO by exploiting the information encoded in WordNet adjectives and its mapping to SUMO. To that end, we considered clusters of semantically related groups of WordNet adjectival and nominal synsets. Based on these clusters, we propose a new semi-automatic model for SUMO attributes and their mapping to WordNet, which also includes polarity information. In this paper, as an exploratory approach, we focus on qualities.

pdf bib
Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon
Itziar Gonzalez-Dios | Jon Alkorta
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)

Wordnets are lexical databases where the semantic relations of words and concepts are established. These resources are useful for manyNLP tasks, such as automatic text classification, word-sense disambiguation or machine translation. In comparison with other wordnets,the Basque version is smaller and some PoS are underrepresented or missing e.g. adjectives and adverbs. In this work, we explore anovel approach to enrich the Basque WordNet, focusing on the adjectives. We want to prove the use and and effectiveness of sentimentlexicons to enrich the resource without the need of starting from scratch. Using as complementary resources, one dictionary and thesentiment valences of the words, we check if the word of the lexicon matches with the meaning of the synset, and if it matches we addthe word as variant to the Basque WordNet. Following this methodology, we describe the most frequent adjectives with positive andnegative valence, the matches and the possible solutions for the non-matches.

2018

pdf bib
Cross-checking WordNet and SUMO Using Meronymy
Javier Álvez | Itziar Gonzalez-Dios | German Rigau
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Verbal Multiword Expressions in Basque Corpora
Uxoa Iñurrieta | Itziar Aduriz | Ainara Estarrona | Itziar Gonzalez-Dios | Antton Gurrutxaga | Ruben Urizar | Iñaki Alegria
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This paper presents a Basque corpus where Verbal Multiword Expressions (VMWEs) were annotated following universal guidelines. Information on the annotation is given, and some ideas for discussion upon the guidelines are also proposed. The corpus is useful not only for NLP-related research, but also to draw conclusions on Basque phraseology in comparison with other languages.

2017

pdf bib
Framework for the Analysis of Simplified Texts Taking Discourse into Account: the Basque Causal Relations as Case Study
Itziar Gonzalez-Dios | Arantza Diaz de Ilarraza | Mikel Iruskieta
Proceedings of the 6th Workshop on Recent Advances in RST and Related Formalisms

2016

pdf bib
A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque
Itziar Gonzalez-Dios | María Jesús Aranzabe | Arantza Díaz de Ilarraza
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

In this paper, we present a comparative analysis of statistically predictive syntactic features of complexity and the treatment of these features by humans when simplifying texts. To that end, we have used a list of the most five statistically predictive features obtained automatically and the Corpus of Basque Simplified Texts (CBST) to analyse how the syntactic phenomena in these features have been manually simplified. Our aim is to go beyond the descriptions of operations found in the corpus and relate the multidisciplinary findings to understand text complexity from different points of view. We also present some issues that can be important when analysing linguistic complexity.

2014

pdf bib
Making Biographical Data in Wikipedia Readable: A Pattern-based Multilingual Approach
Itziar Gonzalez-Dios | María Jesús Aranzabe | Arantza Díaz de Ilarraza
Proceedings of the Workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA 2014)

pdf bib
Simple or Complex? Assessing the readability of Basque Texts
Itziar Gonzalez-Dios | María Jesús Aranzabe | Arantza Díaz de Ilarraza | Haritz Salaberri
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers