Jonathan Francis


pdf bib
Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering
Kaixin Ma | Jonathan Francis | Quanyang Lu | Eric Nyberg | Alessandro Oltramari
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing

Non-extractive commonsense QA remains a challenging AI task, as it requires systems to reason about, synthesize, and gather disparate pieces of information, in order to generate responses to queries. Recent approaches on such tasks show increased performance, only when models are either pre-trained with additional information or when domain-specific heuristics are used, without any special consideration regarding the knowledge resource type. In this paper, we perform a survey of recent commonsense QA methods and we provide a systematic analysis of popular knowledge resources and knowledge-integration methods, across benchmarks from multiple commonsense datasets. Our results and analysis show that attention-based injection seems to be a preferable choice for knowledge integration and that the degree of domain overlap, between knowledge bases and datasets, plays a crucial role in determining model success.


pdf bib
How Would You Say It? Eliciting Lexically Diverse Dialogue for Supervised Semantic Parsing
Abhilasha Ravichander | Thomas Manzini | Matthias Grabmair | Graham Neubig | Jonathan Francis | Eric Nyberg
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

Building dialogue interfaces for real-world scenarios often entails training semantic parsers starting from zero examples. How can we build datasets that better capture the variety of ways users might phrase their queries, and what queries are actually realistic? Wang et al. (2015) proposed a method to build semantic parsing datasets by generating canonical utterances using a grammar and having crowdworkers paraphrase them into natural wording. A limitation of this approach is that it induces bias towards using similar language as the canonical utterances. In this work, we present a methodology that elicits meaningful and lexically diverse queries from users for semantic parsing tasks. Starting from a seed lexicon and a generative grammar, we pair logical forms with mixed text-image representations and ask crowdworkers to paraphrase and confirm the plausibility of the queries that they generated. We use this method to build a semantic parsing dataset from scratch for a dialog agent in a smart-home simulation. We find evidence that this dataset, which we have named SmartHome, is demonstrably more lexically diverse and difficult to parse than existing domain-specific semantic parsing datasets.