Bo Pang


2020

Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
Bo Pang | Erik Nijkamp | Wenjuan Han | Linqi Zhou | Yixian Liu | Kewei Tu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Open-domain dialogue generation has gained increasing attention in Natural Language Processing. Its evaluation requires a holistic means. Human ratings are deemed as the gold standard. As human evaluation is inefficient and costly, an automated substitute is highly desirable. In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. The empirical validity of our metrics is demonstrated by strong correlations with human judgments. We open source the code and relevant materials.

Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Jack Hessel | Zhenhai Zhu | Bo Pang | Radu Soricut
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Pretraining from unlabelled web videos has quickly become the de-facto means of achieving high performance on many video understanding tasks. Features are learned via prediction of grounded relationships between visual content and automatic speech recognition (ASR) tokens. However, prior pretraining work has been limited to only instructional videos; a priori, we expect this domain to be relatively “easy:” speakers in instructional videos will often reference the literal objects/actions being depicted. We ask: can similar models be trained on more diverse video corpora? And, if so, what types of videos are “grounded” and what types are not? We fit a representative pretraining model to the diverse YouTube8M dataset, and study its success and failure cases. We find that visual-textual grounding is indeed possible across previously unexplored video categories, and that pretraining on a more diverse set results in representations that generalize to both non-instructional and instructional domains.

2019

Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering
Soravit Changpinyo | Bo Pang | Piyush Sharma | Radu Soricut
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Object detection plays an important role in current solutions to vision and language tasks like image captioning and visual question answering. However, popular models like Faster R-CNN rely on a costly process of annotating ground-truths for both the bounding boxes and their corresponding semantic labels, making it less amenable as a primitive task for transfer learning. In this paper, we examine the effect of decoupling box proposal and featurization for down-stream tasks. The key insight is that this allows us to leverage a large amount of labeled annotations that were previously unavailable for standard object detection benchmarks. Empirically, we demonstrate that this leads to effective transfer learning and improved image captioning and visual question answering models, as measured on publicly-available benchmarks.

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
Tao Yu | Rui Zhang | Heyang Er | Suyi Li | Eric Xue | Bo Pang | Xi Victoria Lin | Yi Chern Tan | Tianze Shi | Zihan Li | Youxuan Jiang | Michihiro Yasunaga | Sungrok Shim | Tao Chen | Alexander Fabbri | Zifan Li | Luyao Chen | Yuwen Zhang | Shreya Dixit | Vincent Zhang | Caiming Xiong | Richard Socher | Walter Lasecki | Dragomir Radev
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing of unanswerable questions. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, hence maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at https://yale-lily.github.io/cosql.

A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions
Jack Hessel | Bo Pang | Zhenhai Zhu | Radu Soricut
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Instructional videos get high-traffic on video sharing platforms, and prior work suggests that providing time-stamped, subtask annotations (e.g., “heat the oil in the pan”) improves user experiences. However, current automatic annotation methods based on visual features alone perform only slightly better than constant prediction. Taking cues from prior work, we show that we can improve performance significantly by considering automatic speech recognition (ASR) tokens as input. Furthermore, jointly modeling ASR tokens and visual features results in higher performance compared to training individually on either modality. We find that unstated background information is better explained by visual features, whereas fine-grained distinctions (e.g., “add oil” vs. “add olive oil”) are disambiguated more easily via ASR tokens.

SParC: Cross-Domain Semantic Parsing in Context
Tao Yu | Rui Zhang | Michihiro Yasunaga | Yi Chern Tan | Xi Victoria Lin | Suyi Li | Heyang Er | Irene Li | Bo Pang | Tao Chen | Emily Ji | Shreya Dixit | David Proctor | Sungrok Shim | Jonathan Kraft | Vincent Zhang | Caiming Xiong | Richard Socher | Dragomir Radev
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We present SParC, a dataset for cross-domain Semantic Parsing in Context that consists of 4,298 coherent question sequences (12k+ individual questions annotated with SQL queries). It is obtained from controlled user interactions with 200 complex databases over 138 domains. We provide an in-depth analysis of SParC and show that it introduces new challenges compared to existing datasets. SParC (1) demonstrates complex contextual dependencies, (2) has greater semantic diversity, and (3) requires generalization to unseen domains due to its cross-domain nature and the unseen databases at test time. We experiment with two state-of-the-art text-to-SQL models adapted to the context-dependent, cross-domain setup. The best model obtains an exact match accuracy of 20.2% over all questions and less than 10% over all interaction sequences, indicating that the cross-domain setting and the contextual phenomena of the dataset present significant challenges for future research. The dataset, baselines, and leaderboard are released at https://yale-lily.github.io/sparc.

Neural-based Chinese Idiom Recommendation for Enhancing Elegance in Essay Writing
Yuanchao Liu | Bo Pang | Bingquan Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Although the proper use of idioms can enhance the elegance of writing, the active use of various expressions is a challenge because remembering idioms is difficult. In this study, we address the problem of idiom recommendation by leveraging a neural machine translation framework, in which we suppose that idioms are written with one pseudo target language. Two types of real-life datasets are collected to support this study. Experimental results show that the proposed approach achieves promising performance compared with other baseline methods.

2018

Points, Paths, and Playscapes: Large-scale Spatial Language Understanding Tasks Set in the Real World
Jason Baldridge | Tania Bedrax-Weiss | Daphne Luong | Srini Narayanan | Bo Pang | Fernando Pereira | Radu Soricut | Michael Tseng | Yuan Zhang
Proceedings of the First International Workshop on Spatial Language Understanding

Spatial language understanding is important for practical applications and as a building block for better abstract language understanding. Much progress has been made through work on understanding spatial relations and values in images and texts as well as on giving and following navigation instructions in restricted domains. We argue that the next big advances in spatial language understanding can be best supported by creating large-scale datasets that focus on points and paths based in the real world, and then extending these to create online, persistent playscapes that mix human and bot players, where the bot players must learn, evolve, and survive according to their depth of understanding of scenes, navigation, and interactions.

2014

The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter
Chenhao Tan | Lillian Lee | Bo Pang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Alessandro Moschitti | Bo Pang | Walter Daelemans
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2012

Spice it up? Mining Refinements to Online Instructions from User Generated Content
Gregory Druck | Bo Pang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Revisiting the Predictability of Language: Response Completion in Social Media
Bo Pang | Sujith Ravi
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Personalized Recommendation of User Comments via Factor Models
Deepak Agarwal | Bee-Chung Chen | Bo Pang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Search in the Lost Sense of “Query”: Question Formulation in Web Search Queries and its Temporal Changes
Bo Pang | Ravi Kumar
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

For the sake of simplicity: Unsupervised extraction of lexical simplifications from Wikipedia
Mark Yatskar | Bo Pang | Cristian Danescu-Niculescu-Mizil | Lillian Lee
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

Matching Reviews to Objects using a Language Model
Nilesh Dalvi | Ravi Kumar | Bo Pang | Andrew Tomkins
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

For a few dollars less: Identifying review pages sans human labels
Luciano Barbosa | Ravi Kumar | Bo Pang | Andrew Tomkins
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

Introduction to Computational Advertising
Evgeniy Gabrilovich | Vanja Josifovski | Bo Pang
Tutorial Abstracts of ACL-08: HLT

Using Very Simple Statistics for Review Search: An Exploration
Bo Pang | Lillian Lee
Coling 2008: Companion volume: Posters

2006

Get out the vote: Determining support or opposition from Congressional floor-debate transcripts
Matt Thomas | Bo Pang | Lillian Lee
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Doctoral Consortium
Matt Huenerfauth | Bo Pang | Mitch Marcus
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Doctoral Consortium

2005

Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales
Bo Pang | Lillian Lee
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts
Bo Pang | Lillian Lee
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

Syntax-based Alignment of Multiple Translations: Extracting Paraphrases and Generating New Sentences
Bo Pang | Kevin Knight | Daniel Marcu
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

2002

Thumbs up? Sentiment Classification using Machine Learning Techniques
Bo Pang | Lillian Lee | Shivakumar Vaithyanathan
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)