Laura Aina


2019

Putting Words in Context: LSTM Language Models and Lexical Ambiguity
Laura Aina | Kristina Gulordava | Gemma Boleda
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In neural network models of language, words are commonly represented using context-invariant representations (word embeddings) which are then put in context in the hidden layers. Since words are often ambiguous, representing the contextually relevant information is not trivial. We investigate how an LSTM language model deals with lexical ambiguity in English, designing a method to probe its hidden representations for lexical and contextual information about words. We find that both types of information are represented to a large extent, but also that there is room for improvement for contextual information.

What do Entity-Centric Models Learn? Insights from Entity Linking in Multi-Party Dialogue
Laura Aina | Carina Silberer | Ionut-Teodor Sorodoc | Matthijs Westera | Gemma Boleda
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Humans use language to refer to entities in the external world. Motivated by this, in recent years several models that incorporate a bias towards learning entity representations have been proposed. Such entity-centric models have shown empirical success, but we still know little about why. In this paper we analyze the behavior of two recently proposed entity-centric models in a referential task, Entity Linking in Multi-party Dialogue (SemEval 2018 Task 4). We show that these models outperform the state of the art on this task, and that they do better on lower frequency entities than a counterpart model that is not entity-centric, with the same model size. We argue that making models entity-centric naturally fosters good architectural decisions. However, we also show that these models do not really build entity representations and that they make poor use of linguistic context. These negative results underscore the need for model analysis, to test whether the motivations for particular architectures are borne out in how models behave when deployed.

2018

AMORE-UPF at SemEval-2018 Task 4: BiLSTM with Entity Library
Laura Aina | Carina Silberer | Ionut-Teodor Sorodoc | Matthijs Westera | Gemma Boleda
Proceedings of The 12th International Workshop on Semantic Evaluation

This paper describes our winning contribution to SemEval 2018 Task 4: Character Identification on Multiparty Dialogues. It is a simple, standard model with one key innovation, an entity library. Our results show that this innovation greatly facilitates the identification of infrequent characters. Because of the generic nature of our model, this finding is potentially relevant to any task that requires effective learning from sparse or imbalanced data.

How to represent a word and predict it, too: Improving tied architectures for language modelling
Kristina Gulordava | Laura Aina | Gemma Boleda
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Recent state-of-the-art neural language models share the representations of words given by the input and output mappings. We propose a simple modification to these architectures that decouples the hidden state from the word embedding prediction. Our architecture achieves results comparable to or better than those of previous tied models and models without tying, with a much smaller number of parameters. We also extend our proposal to word2vec models, showing that tying is appropriate for general word prediction tasks.