Ngoc Phuoc An Vo


2020

LiMiT: The Literal Motion in Text Dataset
Irene Manotas | Ngoc Phuoc An Vo | Vadim Sheinin
Findings of the Association for Computational Linguistics: EMNLP 2020

Motion recognition is one of the basic cognitive capabilities of many life forms, yet identifying the motion of physical entities in natural language has not been explored extensively and empirically. We present the Literal-Motion-in-Text (LiMiT) dataset, a large human-annotated collection of English sentences describing the physical occurrence of motion, with annotated physical entities in motion. We describe the annotation process for the dataset, analyze its scale and diversity, and report results of several baseline models. We also present future research directions and applications of the LiMiT dataset and share it publicly as a new resource for the research community.

Identifying Motion Entities in Natural Language and A Case Study for Named Entity Recognition
Ngoc Phuoc An Vo | Irene Manotas | Vadim Sheinin | Octavian Popescu
Proceedings of the 28th International Conference on Computational Linguistics

Motion recognition is one of the basic cognitive capabilities of many life forms; however, detecting and understanding motion in text is not a trivial task. In addition, identifying motion entities in natural language is not only challenging but also beneficial for better natural language understanding. In this paper, we present a Motion Entity Tagging (MET) model to identify entities in motion in a text, using the Literal-Motion-in-Text (LiMiT) dataset to train and evaluate the model. We then propose a new method to split clauses and phrases from long, complex motion sentences to improve the performance of our MET model. We also present results showing that motion features, in particular entities in motion, benefit the Named-Entity Recognition (NER) task. Finally, we present an analysis of the special co-occurrence relation between the person category in NER and animate entities in motion, which significantly improves classification performance for the person category in NER.

2018

A Large Resource of Patterns for Verbal Paraphrases
Octavian Popescu | Ngoc Phuoc An Vo | Vadim Sheinin
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

QUEST: A Natural Language Interface to Relational Databases
Vadim Sheinin | Elahe Khorashani | Hangu Yeo | Kun Xu | Ngoc Phuoc An Vo | Octavian Popescu
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Corpora for Learning the Mutual Relationship between Semantic Relatedness and Textual Entailment
Ngoc Phuoc An Vo | Octavian Popescu
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we present the creation of a corpus annotated with both semantic relatedness (SR) scores and textual entailment (TE) judgments. In building this corpus we aimed to discover the relationship, if any, between these two tasks, so that each can benefit from the insights gained from the other. We took corpora already annotated with TE judgments and manually annotated them with SR scores. The RTE 1-4 corpora used in the PASCAL competition fit our need. The annotators worked independently of one another and did not have access to the TE judgments during annotation. The intuition that the two annotations are correlated received major support from this experiment, and this finding led to a system that uses this information to revise the initial estimates of SR scores. As semantic relatedness is one of the most general and difficult tasks in natural language processing, we expect that future systems will combine different sources of information in order to solve it. Our work suggests that textual entailment plays a quantifiable role in addressing it.

DISCO: A System Leveraging Semantic Search in Document Review
Ngoc Phuoc An Vo | Fabien Guillot | Caroline Privault
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

This paper presents Disco, a prototype for supporting knowledge workers in exploring, reviewing and sorting collections of textual data. The goal is to facilitate, accelerate and improve the discovery of information. To this end, it combines Semantic Relatedness techniques with a review workflow developed in a tangible environment. Disco uses a semantic model that is leveraged online during search sessions and accessed through natural hand gestures in a simple and intuitive way.

2015

Learning the Impact of Machine Translation Evaluation Metrics for Semantic Textual Similarity
Simone Magnolini | Ngoc Phuoc An Vo | Octavian Popescu
Proceedings of the International Conference Recent Advances in Natural Language Processing

Learning the Impact and Behavior of Syntactic Structure: A Case Study in Semantic Textual Similarity
Ngoc Phuoc An Vo | Octavian Popescu
Proceedings of the International Conference Recent Advances in Natural Language Processing

Paraphrase Identification and Semantic Similarity in Twitter with Simple Features
Ngoc Phuoc An Vo | Simone Magnolini | Octavian Popescu
Proceedings of the third International Workshop on Natural Language Processing for Social Media

A Preliminary Evaluation of the Impact of Syntactic Structure in Semantic Textual Similarity and Semantic Relatedness Tasks
Ngoc Phuoc An Vo | Octavian Popescu
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

FBK-HLT: An Effective System for Paraphrase Identification and Semantic Similarity in Twitter
Ngoc Phuoc An Vo | Simone Magnolini | Octavian Popescu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

FBK-HLT: A New Framework for Semantic Textual Similarity
Ngoc Phuoc An Vo | Simone Magnolini | Octavian Popescu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

FBK-HLT: An Application of Semantic Textual Similarity for Answer Selection in Community Question Answering
Ngoc Phuoc An Vo | Simone Magnolini | Octavian Popescu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

FBK-TR: Applying SVM with Multiple Linguistic Features for Cross-Level Semantic Similarity
Ngoc Phuoc An Vo | Tommaso Caselli | Octavian Popescu
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

FBK-TR: SVM for Semantic Relatedeness and Corpus Patterns for RTE
Ngoc Phuoc An Vo | Octavian Popescu | Tommaso Caselli
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Fast and Accurate Misspelling Correction in Large Corpora
Octavian Popescu | Ngoc Phuoc An Vo
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)