Ian Lane


2019

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao | Abhinav Rastogi | Semih Yavuz | Dilek Hakkani-Tur | Jindong Chen | Ian Lane
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans. Video question answering is a specific scenario of such AI-human interaction where an agent generates a natural language response to a question regarding the video of a dynamic scene. Incorporating features from multiple modalities, which often provide supplementary information, is one of the challenging aspects of video question answering. Furthermore, a question often concerns only a small segment of the video, so encoding the entire video sequence using a recurrent neural network is not computationally efficient. Our proposed question-guided video representation module efficiently generates a token-level video summary guided by each word in the question. The learned representations are then fused with the question to generate the answer. Through empirical evaluation on the Audio Visual Scene-aware Dialog (AVSD) dataset, our proposed models in single-turn and multi-turn question answering achieve state-of-the-art performance on several automatic natural language generation evaluation metrics.

2018

End-to-End Learning of Task-Oriented Dialogs
Bing Liu | Ian Lane
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

In this thesis proposal, we address the limitations of the conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions. We design a neural network based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialogs. In learning such neural network based dialog systems, we propose hybrid offline training and online interactive learning methods. We introduce a multi-task learning method for pre-training the dialog agent in a supervised manner using task-oriented dialog corpora. The agent trained with supervision can be further improved by interacting with users and learning online from user demonstration and feedback with imitation and reinforcement learning. To address the sample efficiency issue in online policy learning, we further propose a method that combines the learning-from-user and learning-from-simulation approaches to improve online interactive learning efficiency.

Adversarial Learning of Task-Oriented Neural Dialog Models
Bing Liu | Ian Lane
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

In this work, we propose an adversarial learning method for reward estimation in reinforcement learning (RL) based task-oriented dialog models. Most current RL based task-oriented dialog systems require access to a reward signal from either user feedback or user ratings. Such user ratings, however, may not always be consistent or available in practice. Furthermore, online dialog policy learning with RL typically requires a large number of queries to users and thus suffers from a sample efficiency problem. To address these challenges, we propose an adversarial learning method that learns dialog rewards directly from dialog samples. These rewards are then used to optimize the dialog policy with policy gradient based RL. In an evaluation in a restaurant search domain, we show that the proposed adversarial dialog learning method achieves a higher dialog success rate than strong baseline methods. We further discuss the covariate shift problem in online adversarial dialog learning and show how it can be addressed with partial access to user feedback.

2016

Joint Online Spoken Language Understanding and Language Modeling With Recurrent Neural Networks
Bing Liu | Ian Lane
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

Situated Language Understanding at 25 Miles per Hour
Teruhisa Misu | Antoine Raux | Rakesh Gupta | Ian Lane
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

2012

A Simulation-based Framework for Spoken Language Understanding and Action Selection in Situated Interaction
David Cohen | Ian Lane
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)

HRItk: The Human-Robot Interaction ToolKit Rapid Development of Speech-Centric Interactive Systems in ROS
Ian Lane | Vinay Prasad | Gaurav Sinha | Arlette Umuhoza | Shangyu Luo | Akshay Chandrashekaran | Antoine Raux
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)

2010

Tools for Collecting Speech Corpora via Mechanical-Turk
Ian Lane | Matthias Eck | Kay Rottmann | Alex Waibel
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk

2009

Incremental Adaptation of Speech-to-Speech Translation
Nguyen Bach | Roger Hsiao | Matthias Eck | Paisarn Charoenpornsawat | Stephan Vogel | Tanja Schultz | Ian Lane | Alex Waibel | Alan Black
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2007

A Log-Linear Block Transliteration Model based on Bi-Stream HMMs
Bing Zhao | Nguyen Bach | Ian Lane | Stephan Vogel
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Bilingual-LSA Based LM Adaptation for Spoken Language Translation
Yik-Cheung Tam | Ian Lane | Tanja Schultz
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics