Frame Identification as Categorization: Exemplars vs Prototypes in Embeddingland
Jennifer Sikos | Sebastian Padó
Proceedings of the 13th International Conference on Computational Semantics - Long Papers
Categorization is a central capability of human cognition, and a number of theories have been developed to account for properties of categorization. Even though many tasks in semantics also involve categorization of some kind, theories of categorization do not play a major role in contemporary research in computational linguistics. This paper follows the idea that embedding-based models of semantics lend themselves well to being formulated in terms of classical categorization theories. The benefit is a space of model families that enables (a) the formulation of hypotheses about the impact of major design decisions, and (b) a transparent assessment of these decisions. We instantiate this idea on the task of frame-semantic frame identification. We define four models that cross two design variables: (a) the choice of prototype vs. exemplar categorization, corresponding to different degrees of generalization applied to the input; and (b) the presence vs. absence of a fine-tuning step, corresponding to generic vs. task-adaptive categorization. We find that for frame identification, generalization and task-adaptive categorization both yield substantial benefits. Our prototype-based, fine-tuned model, which combines the best choices for these variables, establishes a new state of the art in frame identification.
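The contrast between prototype and exemplar categorization can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame comes with a handful of instance embeddings, that similarity is cosine, that a prototype is the mean of a frame's instance vectors, and that the exemplar score is the similarity to the single closest stored instance. The function names, frame labels, and toy vectors are hypothetical, and the fine-tuning step is not covered.

```python
import numpy as np


def _normalize(x):
    """Scale vectors to unit length along the last axis (assumes non-zero vectors)."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def prototype_classify(query, instances_by_frame):
    """Prototype view: compare the query to one averaged vector per frame."""
    q = _normalize(query)
    scores = {}
    for frame, vecs in instances_by_frame.items():
        centroid = _normalize(np.mean(_normalize(vecs), axis=0))
        scores[frame] = float(q @ centroid)
    return max(scores, key=scores.get)


def exemplar_classify(query, instances_by_frame):
    """Exemplar view: compare the query to every stored instance, keep the best match."""
    q = _normalize(query)
    scores = {
        frame: float(np.max(_normalize(vecs) @ q))
        for frame, vecs in instances_by_frame.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: two frames with two 2-d instance embeddings each.
toy_instances = {
    "Commerce_buy": [[1.0, 0.0], [0.9, 0.1]],
    "Motion": [[0.0, 1.0], [0.1, 0.9]],
}

if __name__ == "__main__":
    query = [0.8, 0.2]
    print(prototype_classify(query, toy_instances))
    print(exemplar_classify(query, toy_instances))
```

The prototype variant generalizes over a frame's instances before comparison, while the exemplar variant keeps every instance and defers generalization to query time; this is the design variable described in (a).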
Lexical resources such as WordNet (Miller, 1995) and FrameNet (Baker et al., 1998) are organized as graphs, where relationships between words are made explicit via the structure of the resource. This work explores how structural information from these lexical resources can lead to gains in a downstream task, namely frame identification. While much of the current work in frame identification uses various neural architectures to predict frames, those neural architectures only use representations of frames based on annotated corpus data. We demonstrate how incorporating knowledge directly from the FrameNet graph structure improves the performance of a neural network-based frame identification system. Specifically, we construct a bidirectional LSTM with a loss function that incorporates various graph- and corpus-based frame embeddings for learning. The model achieves strong performance gains with the graph-based embeddings over corpus-based embeddings alone.
Much interest in Frame Semantics is fueled by the substantial extent of its applicability across languages. At the same time, lexicographic studies have found that the applicability of individual frames can be diminished by cross-lingual divergences regarding polysemy, syntactic valency, and lexicalization. Due to the large effort involved in manual investigations, there are so far no broad-coverage resources with “problematic” frames for any language pair. Our study investigates to what extent multilingual vector representations of frames learned from manually annotated corpora can address this need by serving as a wide-coverage source for such divergences. We present a case study for the language pair English-German using the FrameNet and SALSA corpora and find that inferences can be made about cross-lingual frame applicability using a vector space model.
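One way such a vector space could be queried is sketched below. This is an illustration under stated assumptions, not the method of the study: it assumes that each frame has one vector per language, that both sets of vectors live in a shared multilingual space, and that low cosine similarity between a frame's two vectors is a usable signal of divergence. The function names, frame labels, and numbers are hypothetical.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def rank_by_divergence(en_frames, de_frames):
    """Return (frame, similarity) pairs for shared frames, least similar first."""
    shared = sorted(set(en_frames) & set(de_frames))
    sims = [(frame, cosine(en_frames[frame], de_frames[frame])) for frame in shared]
    return sorted(sims, key=lambda pair: pair[1])


# Hypothetical toy vectors; real frame vectors would be learned from annotated corpora.
en_vectors = {"Commerce_buy": [1.0, 0.0, 0.0], "Motion": [0.0, 1.0, 0.0]}
de_vectors = {"Commerce_buy": [0.9, 0.1, 0.0], "Motion": [0.5, 0.5, 0.7]}

if __name__ == "__main__":
    for frame, sim in rank_by_divergence(en_vectors, de_vectors):
        print(f"{frame}\t{sim:.3f}")
```

Frames at the top of the ranking would be candidates for manual inspection; where to draw a threshold is left open here.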