Kim Anh Nguyen


2018

pdf bib
Introducing Two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness
Kim Anh Nguyen | Sabine Schulte im Walde | Ngoc Thang Vu
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges. The two datasets are verified through standard co-occurrence and neural network models, showing results comparable to the respective English datasets.

pdf bib
Integrating Predictions from Neural-Network Relation Classifiers into Coreference and Bridging Resolution
Ina Roesiger | Maximilian Köper | Kim Anh Nguyen | Sabine Schulte im Walde
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference

Cases of coreference and bridging resolution often require knowledge about semantic relations between anaphors and antecedents. We suggest state-of-the-art neural-network classifiers trained on relation benchmarks to predict and integrate likelihoods for relations. Two experiments with representations differing in noise and complexity improve our bridging but not our coreference resolver.

pdf bib
Dave the debater: a retrieval-based and generative argumentative dialogue agent
Dieu Thu Le | Cam-Tu Nguyen | Kim Anh Nguyen
Proceedings of the 5th Workshop on Argument Mining

In this paper, we explore the problem of developing an argumentative dialogue agent that can be able to discuss with human users on controversial topics. We describe two systems that use retrieval-based and generative models to make argumentative responses to the users. The experiments show promising results although they have been trained on a small dataset.

2017

pdf bib
Distinguishing Antonyms and Synonyms in a Pattern-based Neural Network
Kim Anh Nguyen | Sabine Schulte im Walde | Ngoc Thang Vu
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Distinguishing between antonyms and synonyms is a key task to achieve high performance in NLP systems. While they are notoriously difficult to distinguish by distributional co-occurrence models, pattern-based methods have proven effective to differentiate between the relations. In this paper, we present a novel neural network model AntSynNET that exploits lexico-syntactic patterns from syntactic parse trees. In addition to the lexical and syntactic information, we successfully integrate the distance between the related words along the syntactic path as a new pattern feature. The results from classification experiments show that AntSynNET improves the performance over prior pattern-based methods.

pdf bib
Hierarchical Embeddings for Hypernymy Detection and Directionality
Kim Anh Nguyen | Maximilian Köper | Sabine Schulte im Walde | Ngoc Thang Vu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality. While previous embeddings have shown limitations on prototypical hypernyms, HyperVec represents an unsupervised measure where embeddings are learned in a specific order and capture the hypernym–hyponym distributional hierarchy. Moreover, our model is able to generalize over unseen hypernymy pairs, when using only small sets of training data, and by mapping to other languages. Results on benchmark datasets show that HyperVec outperforms both state-of-the-art unsupervised measures and embedding models on hypernymy detection and directionality, and on predicting graded lexical entailment.

2016

pdf bib
Neural-based Noise Filtering from Word Embeddings
Kim Anh Nguyen | Sabine Schulte im Walde | Ngoc Thang Vu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Word embeddings have been demonstrated to benefit NLP tasks impressively. Yet, there is room for improvements in the vector representations, because current word embeddings typically contain unnecessary information, i.e., noise. We propose two novel models to improve word embeddings by unsupervised learning, in order to yield word denoising embeddings. The word denoising embeddings are obtained by strengthening salient information and weakening noise in the original word embeddings, based on a deep feed-forward neural network filter. Results from benchmark tasks show that the filtered word denoising embeddings outperform the original word embeddings.

pdf bib
Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction
Kim Anh Nguyen | Sabine Schulte im Walde | Ngoc Thang Vu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)