Mariona Coll Ardanuy


2020

Living Machines: A study of atypical animacy
Mariona Coll Ardanuy | Federico Nanni | Kaspar Beelen | Kasra Hosseini | Ruth Ahnert | Jon Lawrence | Katherine McDonough | Giorgia Tolfo | Daniel CS Wilson | Barbara McGillivray
Proceedings of the 28th International Conference on Computational Linguistics

This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it, we have created the first dataset for atypical animacy detection, based on nineteenth-century sentences in English, with machines represented as either animate or inanimate. Our method builds on recent innovations in language modeling, specifically BERT contextualized word embeddings, to better capture fine-grained contextual properties of words. We present a fully unsupervised pipeline, which can be easily adapted to different contexts, and report its performance on an established animacy dataset and our newly introduced resource. We show that our method provides a substantially more accurate characterization of atypical animacy, especially when applied to highly complex forms of language use.

DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
Kasra Hosseini | Federico Nanni | Mariona Coll Ardanuy
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach is especially useful where only limited training examples are available. The learned DeezyMatch models can be used to generate rich vector representations from string inputs. The candidate ranker component in DeezyMatch uses these vector representations to find, for a given query, the best matching candidates in a knowledge base. It uses an adaptive searching algorithm applicable to large knowledge bases and query sets. We describe DeezyMatch’s functionality, design and implementation, accompanied by a use case in toponym matching and candidate ranking in realistic noisy datasets.

2017

Proceedings of the Student Research Workshop at the 15th Conference of the European Chapter of the Association for Computational Linguistics
Florian Kunneman | Uxoa Iñurrieta | John J. Camilleri | Mariona Coll Ardanuy
Proceedings of the Student Research Workshop at the 15th Conference of the European Chapter of the Association for Computational Linguistics

2016

You Shall Know People by the Company They Keep: Person Name Disambiguation for Social Network Construction
Mariona Coll Ardanuy | Maarten van den Bos | Caroline Sporleder
Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

2014

Structure-based Clustering of Novels
Mariona Coll Ardanuy | Caroline Sporleder
Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL)