Charles Chen, Jr.

Also published as: Charles Chen, Charles Chen Jr.


2020

Task-Oriented Dialogue as Dataflow Synthesis
Jacob Andreas | John Bufe | David Burkett | Charles Chen | Josh Clausman | Jean Crawford | Kate Crim | Jordan DeLoach | Leah Dorner | Jason Eisner | Hao Fang | Alan Guo | David Hall | Kristin Hayes | Kellie Hill | Diana Ho | Wendy Iwaszuk | Smriti Jha | Dan Klein | Jayant Krishnamurthy | Theo Lanman | Percy Liang | Christopher H. Lin | Ilya Lintsbakh | Andy McGovern | Aleksandr Nisnevich | Adam Pauls | Dmitrij Petters | Brent Read | Dan Roth | Subhro Roy | Jesse Rusak | Beth Short | Div Slomin | Ben Snyder | Stephon Striplin | Yu Su | Zachary Tellman | Sam Thomson | Andrei Vorobev | Izabela Witoszko | Jason Wolfe | Abby Wray | Yuchen Zhang | Alexander Zotov
Transactions of the Association for Computational Linguistics, Volume 8

We describe an approach to task-oriented dialogue in which dialogue state is represented as a dataflow graph. A dialogue agent maps each user utterance to a program that extends this graph. Programs include metacomputation operators for reference and revision that reuse dataflow fragments from previous turns. Our graph-based state enables the expression and manipulation of complex user intents, and explicit metacomputation makes these intents easier for learned models to predict. We introduce a new dataset, SMCalFlow, featuring complex dialogues about events, weather, places, and people. Experiments show that dataflow graphs and metacomputation substantially improve representability and predictability in these natural dialogues. Additional experiments on the MultiWOZ dataset show that our dataflow representation enables an otherwise off-the-shelf sequence-to-sequence model to match the best existing task-specific state tracking model. The SMCalFlow dataset, code for replicating experiments, and a public leaderboard are available at https://www.microsoft.com/en-us/research/project/dataflow-based-dialogue-semantic-machines.

2019

Context Dependent Semantic Parsing over Temporally Structured Data
Charles Chen | Razvan Bunescu
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We describe a new semantic parsing setting that allows users to query the system using both natural language questions and actions within a graphical user interface. Multiple time series belonging to an entity of interest are stored in a database and the user interacts with the system to obtain a better understanding of the entity’s state and behavior, entailing sequences of actions and questions whose answers may depend on previous factual or navigational interactions. We design an LSTM-based encoder-decoder architecture that models context dependency through copying mechanisms and multiple levels of attention over inputs and previous outputs. When trained to predict tokens using supervised learning, the proposed architecture substantially outperforms standard sequence generation baselines. Training the architecture using policy gradient leads to further improvements in performance, reaching a sequence-level accuracy of 88.7% on artificial data and 74.8% on real data.

2017

An Exploration of Data Augmentation and RNN Architectures for Question Ranking in Community Question Answering
Charles Chen | Razvan Bunescu
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

The automation of tasks in community question answering (cQA) is dominated by machine learning approaches, whose performance is often limited by the number of training examples. Starting from a neural sequence learning approach with attention, we explore the impact of two data augmentation techniques on question ranking performance: a method that swaps reference questions with their paraphrases, and training on examples automatically selected from external datasets. Both methods are shown to lead to substantial gains in accuracy over a strong baseline. Further improvements are obtained by changing the model architecture to mirror the structure seen in the data.

2014

Focusing on a Subset of Scripts Enhances the Learning Efficiency of Second Language Writing System
Ching-Pong Au | Yuk-Man Cheung | Charles Chen Jr.
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing